Set fastText submodule to v0.2.0. #55

Open
wants to merge 49 commits into
base: master
f16e463
added a convinience function that returns an array instead of a list …
borissmidt Aug 7, 2017
c576b1f
add Java binding for getSentenceVector
bxshi Jan 23, 2019
e6af0a0
Set fastText submodule to v0.2.0.
Sep 12, 2018
e7449bc
Increase version number to 0.5
Jun 17, 2019
c12efce
Update jdk version to oraclejdk11 so that Mac OS can build. Update RE…
Jun 17, 2019
1efff12
Specify JDK version for Javadoc.
Jun 17, 2019
0e03434
Travis settings.
Jul 23, 2019
aadc4d3
add Java binding for getSentenceVector
bxshi Jan 23, 2019
ee067b0
Merge branch 'master' into float_array
Jul 24, 2019
3d545ce
Add unit test.
Jul 24, 2019
d690608
Merge branch 'master' into getSentenceVector
Jul 24, 2019
ed5eb5a
Add unit test.
Jul 24, 2019
9793d7b
Build on OpenJDK11
Jul 25, 2019
a2d8f93
Merge pull request #2 from carschno/getSentenceVector
carschno Jul 25, 2019
10429dc
Build on OpenJDK11
Jul 25, 2019
58a432b
Merge branch 'master' into float_array
Jul 25, 2019
430486e
Merge pull request #1 from carschno/float_array
carschno Jul 25, 2019
3c468a9
Adapt information to fork.
Jul 25, 2019
ed194e7
Upgrade javacpp to v1.5.1
Jul 25, 2019
ca617a7
Update to JavaCPP v1.5.1
carschno Jul 28, 2019
4fb8599
Bump version to 0.5.0
carschno Jul 28, 2019
6d92a1c
Update example
carschno Jul 28, 2019
bf2f52b
Update instructions.
carschno Jul 28, 2019
7c6cec9
Fix Maven dependency.
carschno Jul 28, 2019
84f13eb
Remove claim for Windows/MacOSX binary support (Closes #5).
carschno Oct 25, 2019
9f60ef2
Remove claim for Windows/MacOSX binary support (Closes #5).
carschno Oct 25, 2019
e7c7c12
Bump version.
carschno Oct 25, 2019
3852102
Update to FastText v0.9.1.
carschno Oct 25, 2019
143f94b
Update to most FastText branch with fix for comparePairs() call.
carschno Oct 25, 2019
685cad2
Include meter.cc.
carschno Oct 25, 2019
5674ed7
Add instructions for Windows, Mac OS (#5).
carschno Oct 26, 2019
6b125f5
Update Maven dependency version.
carschno Oct 26, 2019
a840fec
Merge branch 'master' into fasttext_0.9.1
carschno Oct 26, 2019
5b2e96d
Fix issue #19 - allow to load models from InputStream/URL/URI
alexott Apr 21, 2019
7f9e725
Add test for model file resource loading.
Jul 25, 2019
e7d8847
Commit test model.
carschno Oct 26, 2019
43ad704
Merge pull request #3 from carschno/issue-19-fix
carschno Oct 26, 2019
d0a6614
Merge branch 'master' into fasttext_0.9.1
carschno Oct 26, 2019
012397e
Fix test10ModelFromURL().
carschno Oct 27, 2019
230a513
Update Maven dependency version.
carschno Oct 27, 2019
16d19f7
Merge pull request #7 from carschno/fasttext_0.9.1
carschno Oct 27, 2019
777df81
Add Changelog.
carschno Oct 28, 2019
41b58a1
Release version 0.9.1.
carschno Oct 28, 2019
d4756b5
Update dependency version.
carschno Oct 28, 2019
781b736
Set autoReleaseAfterClose to false.
carschno Oct 28, 2019
18c756b
Add optimized version of getSentenceVector
GotoFinal Feb 18, 2020
762128d
Merge pull request #8 from GotoFinal/patch-1
carschno Feb 18, 2020
0c7b10e
Move release-related plugins into a separate profile
alexott Nov 21, 2020
e0da9e4
Merge pull request #10 from alexott/maven-improvements
carschno Nov 22, 2020
3 changes: 1 addition & 2 deletions .travis.yml
@@ -1,9 +1,8 @@
language: java
jdk:
- oraclejdk8
- openjdk11
os:
- linux
- osx
cache: bundler
# Setting install to 'true' to prevent Travis CI from installing dependencies via:
# "mvn install -DskipTests=true -Dmaven.javadoc.skip=true -B -V" which fails due to missing the GPG secret key.
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,7 @@
# Changelog

## Version 0.9.1

* Use [FastText v0.9.1](https://github.com/facebookresearch/fastText/releases/tag/v0.9.1)
* Minor bug fixes

32 changes: 25 additions & 7 deletions README.md
@@ -1,17 +1,19 @@
[![Build Status](https://travis-ci.org/vinhkhuc/JFastText.svg?branch=master)](https://travis-ci.org/vinhkhuc/JFastText)
[![Build Status](https://travis-ci.org/carschno/JFastText.svg?branch=master)](https://travis-ci.org/carschno/JFastText)

Table of Contents
=================

* [Introduction](#introduction)
* [Maven Dependency](#maven-dependency)
* [Windows and Mac OSX](#windows-and-mac-os-x)
* [Building](#building)
* [Quick Application - Language Identification](#quick-application-\--language-identification)
* [Detailed Examples](#detailed-examples)
* [API](#api)
* [FastText's Command Line](#fasttexts-command-line)
* [License](#license)
* [References](#references)
* [Changelog](CHANGELOG.md)


## Introduction
@@ -28,23 +30,39 @@ JFastText is ideal for building fast text classifiers in Java.
## Maven Dependency
```xml
<dependency>
<groupId>com.github.vinhkhuc</groupId>
<groupId>io.github.carschno</groupId>
<artifactId>jfasttext</artifactId>
<version>0.4</version>
<version>0.9.1</version>
</dependency>
```
The Jar package on Maven Central is bundled with precompiled fastText library for Windows, Linux and
MacOSX 64bit.
The Jar package on Maven Central is bundled with precompiled fastText library for ~~Windows,~~ Linux ~~and
MacOSX~~ 64bit.

### Windows and Mac OS X

Currently, the Maven dependency only contains binaries for Linux (64 bit), _not_ for Windows or Mac OS X.
In order to use JFastText for Windows or Mac OS X (or any other system), you need to build it yourself (see [below](#building)).

## Building
C++ compiler (g++ on Mac/Linux or cl.exe on Windows) is required to compile fastText's code.
C++ compiler (g++ on Mac/Linux or `cl.exe` on Windows) is required to compile fastText's code.

```bash
git clone --recursive https://github.com/vinhkhuc/JFastText
git clone --recursive https://github.com/carschno/JFastText
cd JFastText
git submodule init
git submodule update
mvn package
```

### Building on Windows

The (automatic) build seems to fail on some Windows systems/C++ compilers.
See [this issue](https://github.com/carschno/JFastText/issues/5#issuecomment-546485377):

> I used MS's developer tools, not the full-blown Visual Studio. If I run `cl` directly, the compilation fails with the same error.
>
> I was able to build on Windows by changing the call to `cl.exe` and running it outside the Maven build. I changed one parameter in the call to `cl`: I use `/MT` (whereas Maven uses `/MD`). Bundling the generated DLLs works fine.

## Quick Application - Language Identification
JFastText can use FastText's pretrained models directly. Language identification models can be downloaded [here](https://fasttext.cc/docs/en/language-identification.html).
In this quick example, we will use the [quantized model](https://s3-us-west-1.amazonaws.com/fasttext-vectors/supervised_models/lid.176.ftz)
9 changes: 6 additions & 3 deletions examples/api/pom.xml
@@ -7,12 +7,15 @@
<groupId>com.github.vinhkhuc</groupId>
<artifactId>java_sandbox</artifactId>
<version>0.1</version>

<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>com.github.vinhkhuc</groupId>
<groupId>io.github.carschno</groupId>
<artifactId>jfasttext</artifactId>
<version>0.4</version>
<version>0.5.0</version>
</dependency>
</dependencies>
</project>
82 changes: 51 additions & 31 deletions pom.xml
@@ -4,9 +4,9 @@
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<groupId>com.github.vinhkhuc</groupId>
<groupId>io.github.carschno</groupId>
<artifactId>jfasttext</artifactId>
<version>0.4</version>
<version>0.9.1</version>
<name>Java interface for fastText</name>
<description>
JFastText is a Java interface for fastText, a library for efficient learning of
@@ -22,6 +22,15 @@
</licenses>

<developers>
<developer>
<id>carschno</id>
<name>Carsten Schnober</name>
<url>http://github.com/carschno</url>
<roles>
<role>developer</role>
<role>maintainer</role>
</roles>
</developer>
<developer>
<id>vinhkhuc</id>
<name>Vinh Khuc</name>
@@ -34,13 +43,14 @@
</developers>

<scm>
<connection>scm:git:https://github.com/vinhkhuc/JFastText.git</connection>
<developerConnection>scm:git:git@github.com:vinhkhuc/JFastText.git</developerConnection>
<url>https://github.com/vinhkhuc/JFastText</url>
<connection>scm:git:https://github.com/carschno/JFastText.git</connection>
<developerConnection>scm:git:git@github.com:carschno/JFastText.git</developerConnection>
<url>https://github.com/carschno/JFastText</url>
</scm>

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<javacpp.version>1.5.1</javacpp.version>
</properties>

<distributionManagement>
@@ -65,21 +75,42 @@
<additionalparam>-Xdoclint:none</additionalparam>
</properties>
</profile>
</profiles>

<build>
<plugins>
<plugin>
<profile>
<id>release</id>
<build>
<plugins>
<plugin>
<groupId>org.sonatype.plugins</groupId>
<artifactId>nexus-staging-maven-plugin</artifactId>
<version>1.6.3</version>
<extensions>true</extensions>
<configuration>
<serverId>ossrh</serverId>
<nexusUrl>https://oss.sonatype.org/</nexusUrl>
<autoReleaseAfterClose>true</autoReleaseAfterClose>
<serverId>ossrh</serverId>
<nexusUrl>https://oss.sonatype.org/</nexusUrl>
<autoReleaseAfterClose>false</autoReleaseAfterClose>
</configuration>
</plugin>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-gpg-plugin</artifactId>
<version>1.5</version>
<executions>
<execution>
<id>sign-artifacts</id>
<phase>verify</phase>
<goals>
<goal>sign</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>
</profiles>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
@@ -110,6 +141,9 @@
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<version>2.9.1</version>
<configuration>
<source>8</source>
</configuration>
<executions>
<execution>
<id>attach-javadocs</id>
@@ -119,20 +153,6 @@
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-gpg-plugin</artifactId>
<version>1.5</version>
<executions>
<execution>
<id>sign-artifacts</id>
<phase>verify</phase>
<goals>
<goal>sign</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-enforcer-plugin</artifactId>
@@ -155,7 +175,7 @@
<plugin>
<groupId>org.bytedeco</groupId>
<artifactId>javacpp</artifactId>
<version>1.3.1</version>
<version>${javacpp.version}</version>
<executions>
<execution>
<id>run-javacpp-parser</id>
@@ -234,8 +254,8 @@
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>javacpp</artifactId>
<version>1.3.1</version>
<version>${javacpp.version}</version>
</dependency>
</dependencies>

</project>
</project>
2 changes: 1 addition & 1 deletion src/main/cpp/fastText
Submodule fastText updated 115 files
11 changes: 9 additions & 2 deletions src/main/cpp/fasttext_wrapper.cc
@@ -74,13 +74,20 @@ namespace FastTextWrapper {
const std::string& text, int32_t k) {
std::vector<std::pair<real,std::string>> predictions;
std::istringstream in(text);
fastText.predict(in, k, predictions);
fastText.predictLine(in, predictions, k, 0.0);
return predictions;
}
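
Note on the hunk above: the replacement call passes `k` and a threshold of `0.0` to `predictLine`. As a rough illustration of what those two arguments are understood to control (at most `k` results, none with a probability below the threshold), here is a minimal, self-contained sketch. `topKAboveThreshold` and `Prediction` are hypothetical names introduced for this note, and the code is an assumption-based stand-in, not fastText's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using real = float;  // assumption: mirrors fastText's float-based `real` type
using Prediction = std::pair<real, std::string>;

// Hypothetical helper, not part of fastText or JFastText: keeps at most k
// candidates whose probability is >= threshold, highest probability first.
std::vector<Prediction> topKAboveThreshold(std::vector<Prediction> candidates,
                                           int32_t k, real threshold) {
  assert(k >= 0);
  // Drop candidates below the threshold. With a threshold of 0.0 nothing is
  // dropped as long as all probabilities are non-negative.
  candidates.erase(
      std::remove_if(candidates.begin(), candidates.end(),
                     [threshold](const Prediction& p) {
                       return p.first < threshold;
                     }),
      candidates.end());
  // Sort by descending probability; ties keep their input order.
  std::stable_sort(candidates.begin(), candidates.end(),
                   [](const Prediction& a, const Prediction& b) {
                     return a.first > b.first;
                   });
  // Truncate to at most k entries.
  if (candidates.size() > static_cast<std::size_t>(k)) {
    candidates.resize(static_cast<std::size_t>(k));
  }
  return candidates;
}
```

Under these assumptions, a threshold of `0.0` leaves `k` as the only limit on the number of returned labels, which matches the behaviour callers of `predictProba` had before the change.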

std::vector<real> FastTextApi::getVector(const std::string& word) {
Vector vec(privateMembers->args_->dim);
fastText.getVector(vec, word);
fastText.getWordVector(vec, word);
return std::vector<real>(vec.data(), vec.data() + vec.size());
}

std::vector<real> FastTextApi::getSentenceVector(const std::string& sentence) {
Vector vec(privateMembers->args_->dim);
std::istringstream in(sentence);
fastText.getSentenceVector(in, vec);
return std::vector<real>(vec.data(), vec.data() + vec.size());
}
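
Note on the hunk above: both `getVector` and the new `getSentenceVector` follow the same pattern. They allocate a fixed-size buffer, let the library fill it (from a string wrapped in a `std::istringstream` in the sentence case), and then copy the contents into a `std::vector<real>` via the iterator-range constructor. The sketch below isolates that pattern without depending on fastText. `ToyVector`, `toyFill`, and `toySentenceVector` are hypothetical stand-ins introduced for this note; the toy fill logic only counts tokens and has nothing to do with how fastText computes embeddings.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

using real = float;  // assumption: mirrors fastText's float-based `real` type

// Hypothetical stand-in for a fixed-size numeric buffer with data()/size().
class ToyVector {
 public:
  explicit ToyVector(std::size_t dim) : values_(dim, 0.0f) {}
  real* data() { return values_.data(); }
  std::size_t size() const { return values_.size(); }

 private:
  std::vector<real> values_;
};

// Hypothetical stand-in for a library call that reads whitespace-separated
// tokens from a stream and writes into `out`. Token i increments component
// (i mod dim), so the result is easy to check by hand.
void toyFill(std::istream& in, ToyVector& out) {
  std::string token;
  std::size_t index = 0;
  while (in >> token) {
    out.data()[index % out.size()] += 1.0f;
    ++index;
  }
}

// Same shape as the wrapper methods: buffer, stream, fill, copy out.
std::vector<real> toySentenceVector(const std::string& sentence,
                                    std::size_t dim) {
  assert(dim > 0);
  ToyVector vec(dim);
  std::istringstream in(sentence);
  toyFill(in, vec);
  // Iterator-range constructor copies [data, data + size) into a new vector.
  return std::vector<real>(vec.data(), vec.data() + vec.size());
}
```

Returning a `std::vector` by value keeps ownership simple across the binding boundary: the caller receives an independent copy and the temporary buffer is destroyed when the function returns.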

1 change: 1 addition & 0 deletions src/main/cpp/fasttext_wrapper.h
@@ -29,6 +29,7 @@ namespace FastTextWrapper {
std::vector<std::string> predict(const std::string&, int32_t);
std::vector<std::pair<real,std::string>> predictProba(const std::string&, int32_t);
std::vector<real> getVector(const std::string&);
std::vector<real> getSentenceVector(const std::string&);
std::vector<std::string> getWords();
std::vector<std::string> getLabels();
std::string getWord(int32_t);
5 changes: 4 additions & 1 deletion src/main/cpp/fasttext_wrapper_javacpp.h
@@ -1,12 +1,15 @@
// Added <numeric> since VS 14.0 complains about missing std::iota
#include <numeric>
#include "fastText/src/args.cc"
#include "fastText/src/densematrix.cc"
#include "fastText/src/dictionary.cc"
#include "fastText/src/fasttext.cc"
#include "fastText/src/loss.cc"
#include "fastText/src/matrix.cc"
#include "fastText/src/meter.cc"
#include "fastText/src/model.cc"
#include "fastText/src/productquantizer.cc"
#include "fastText/src/qmatrix.cc"
#include "fastText/src/quantmatrix.cc"
#include "fastText/src/vector.cc"
#include "fastText/src/utils.cc"
