NLP / Named Entity Recognition + Linking #67

Open
Jiros opened this issue Jul 10, 2020 · 4 comments
Labels
Epic Priority: High This issue should be dealt with as soon as possible Tag: Data Processing Tag: Help Wanted Extra attention is needed

Comments

@Jiros
Contributor

Jiros commented Jul 10, 2020

We need additional expertise to assist with a number of issues that could benefit from NLP/Entity recognition. Our goal is to identify and create more meaningful relationships in currently unconnected sub-graphs within CovidGraph.

Our ethos is open and transparent so we would prefer open source solutions.

For example, within publication and patent text we would like to identify:

  • gene names
  • drug names
  • synonyms for people/institutions
  • synonyms for locations
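
As a rough illustration of what recognition plus linking could look like for the entity types listed above, here is a minimal dictionary-based sketch in Python. Everything in it is an assumption made for illustration: the `LEXICON` entries, the identifier scheme (`GENE:…`, `DRUG:…`), and the `Mention` and `find_mentions` names are hypothetical and do not come from CovidGraph. A real pipeline would more likely use a trained model and a curated ontology; this only shows the idea of mapping several surface forms (synonyms) to one canonical node.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-lexicon: lowercase surface form -> (canonical id, type).
# The ids are made up for this sketch; synonyms share the same id.
LEXICON = {
    "ace2": ("GENE:ACE2", "gene"),
    "angiotensin-converting enzyme 2": ("GENE:ACE2", "gene"),
    "remdesivir": ("DRUG:remdesivir", "drug"),
    "gs-5734": ("DRUG:remdesivir", "drug"),
}


@dataclass(frozen=True)
class Mention:
    text: str         # surface form as it appears in the document
    start: int        # character offset where the mention starts
    end: int          # character offset just past the mention
    entity_id: str    # canonical id the mention is linked to
    entity_type: str  # e.g. "gene" or "drug"


def build_pattern(lexicon):
    # Longest forms first so multi-word names win over shorter overlaps.
    forms = sorted(lexicon, key=len, reverse=True)
    alternatives = "|".join(re.escape(form) for form in forms)
    # Lookarounds stop matches inside longer tokens (e.g. "ACE2x").
    return re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)", re.IGNORECASE)


def find_mentions(text, lexicon=LEXICON):
    """Return all lexicon mentions in text, linked to canonical ids."""
    pattern = build_pattern(lexicon)
    mentions = []
    for match in pattern.finditer(text):
        surface = match.group(1)
        entity_id, entity_type = lexicon[surface.lower()]
        mentions.append(
            Mention(surface, match.start(1), match.end(1), entity_id, entity_type)
        )
    return mentions


if __name__ == "__main__":
    sample = "Remdesivir (GS-5734) was tested; ACE2 is the entry receptor."
    for mention in find_mentions(sample):
        print(mention.text, "->", mention.entity_id)
```

Two publications whose mentions resolve to the same `entity_id` could then be connected through a shared entity node, which is one way currently unconnected sub-graphs might be linked.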
@Jiros Jiros added the Epic label Jul 10, 2020
@Jiros Jiros changed the title NLP / Entity Recognition NLP & Named Entity Recognition + Linking Jul 10, 2020
@Jiros Jiros added Type: Data Analysis To identify an issue as data analysis Tag: Help Wanted Extra attention is needed Priority: High This issue should be dealt with as soon as possible labels Jul 10, 2020
@Jiros Jiros changed the title NLP & Named Entity Recognition + Linking NLP / Named Entity Recognition + Linking Jul 10, 2020
@mpreusse
Member

Here is a publication with some interesting resources for our NLP tasks: https://www.nature.com/articles/s41597-020-0543-2

@Jiros
Contributor Author

Jiros commented Jul 18, 2020

I added an extract from the introduction of the Nature article to #35, the researcher use case, as it includes a good description of what researchers might be looking for from a system like CovidGraph.

@Fohlen

Fohlen commented Aug 13, 2020

Hello there everyone! I came here because @yGuy made me aware of the project. BioBERT has been used very successfully on the COVID-19 papers (see https://covidask.korea.ac.kr/). However, I think this is still an interesting issue to tackle; specifically, one could test whether the newest advances in Transformer-based language models (GPT-3 by OpenAI) achieve even better performance. I talked with my supervisor @coltekin (University of Tuebingen), and it seems like a suitable collaboration for my bachelor's thesis in computational linguistics.

If this sounds interesting, it would be good to have a concrete list of the entities that are really relevant to investigate.
All the best,

Lennard

@motey
Member

motey commented Aug 13, 2020

Hi Lennard,

If this sounds interesting

YES! 🚀

I was already peering at GPT-2/3 :)
It would be great if you want to have a try at the NLP+Covid*Graph-thing.
The easiest way to bring you up to speed on the data model and relevant nodes is probably to join our chat on the matrix.org network; see https://github.com/covidgraph/documentation/wiki#communication
You can add me via @tim.bleimehl:meet.dzd-ev.de and I can invite you to the relevant groups. From there we could also set up a call with some of us in the data-core group.

Also, we are trying to get our hands on some more hardware (GPUs) to speed up computing, if applicable at a later stage.

@Jiros Jiros added Tag: Data Processing and removed Type: Data Analysis To identify an issue as data analysis labels Dec 8, 2020
@Jiros Jiros closed this as completed Dec 8, 2020
@Jiros Jiros reopened this Dec 8, 2020