Wordpacks and Dictionary

Text mining relies on a large number of terms that are searched for in the text. This section explains how to create or import your own dictionaries, or simply use HubScience's built-in ones.

 

There are two different notions here: wordpack and dictionary. Let's first see what we mean by them.

 

Wordcard


The smallest element is the wordcard. If you would like to find a word or term during text mining, you need to add it to the system on a wordcard. The most important properties of a wordcard are the term, its synonyms, and its abbreviations. The synonyms and abbreviations are also searched for in the text, and their matches are linked to the same term.

You can also add arbitrary properties of your choice, e.g. a PubMed ID. These properties are shown as extra information alongside the resulting annotations, which makes the annotations easier to browse.
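
To make the structure of a wordcard concrete, here is a minimal Python sketch. It is not HubScience's internal data model: the class name `Wordcard`, its field names, and the `surface_forms` helper are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Wordcard:
    """Hypothetical model of a wordcard; all names are illustrative."""
    term: str
    synonyms: list[str] = field(default_factory=list)
    abbreviations: list[str] = field(default_factory=list)
    # Arbitrary extra properties, shown next to the annotations.
    properties: dict[str, str] = field(default_factory=dict)

    def surface_forms(self) -> list[str]:
        """All strings that are searched for and linked to this term."""
        return [self.term, *self.synonyms, *self.abbreviations]


card = Wordcard(
    term="myocardial infarction",
    synonyms=["heart attack"],
    abbreviations=["MI"],
    properties={"PubMed ID": "placeholder"},  # placeholder, not a real ID
)
print(card.surface_forms())
```

Every string returned by `surface_forms` points back to the same card, which is the sense in which the term, its synonyms, and its abbreviations are linked together.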

Wordpack

 

A wordpack is a list of wordcards of the same sort. What counts as the same sort is up to you, but take the following into consideration:

   Under one category you can add several wordpacks, so a wordpack does not have to cover a whole category. For example, you can make one wordpack named 'My important diseases' and another named 'Rare diseases'; both can be under the same 'Disease' category.

 

   Wordpacks can be added to different projects independently. If you want to use only 'My important diseases' in one project, only 'Rare diseases' in another, and both in a third, you are free to do so.

 

   All terms in a wordpack are matched to the text in the same way, so it is better to keep wordcards of very different types in separate wordpacks. For example, person names must be capitalized to match, while for disease names capitalization does not matter.
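
The capitalization point can be illustrated with a short Python sketch. HubScience's actual matching rules are not described here; the function `find_terms` and its `case_sensitive` flag are hypothetical and only show why one matching rule per wordpack favors keeping names and diseases apart.

```python
import re


def find_terms(text, terms, case_sensitive):
    """Return (term, start, end) for each whole-word match of any term."""
    flags = 0 if case_sensitive else re.IGNORECASE
    matches = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags):
            matches.append((term, m.start(), m.end()))
    # Report hits in the order they occur in the text.
    return sorted(matches, key=lambda hit: hit[1])


text = "Dr. Parkinson described parkinson disease; mark the date."

# A wordpack of person names: capitalization must match.
person_hits = find_terms(text, ["Parkinson", "Mark"], case_sensitive=True)

# A wordpack of disease names: capitalization is ignored.
disease_hits = find_terms(text, ["Parkinson disease"], case_sensitive=False)

print([text[s:e] for _, s, e in person_hits])
print([text[s:e] for _, s, e in disease_hits])
```

With case-sensitive matching, the lowercase word "mark" is not mistaken for the name "Mark", while the disease name is still found even though it is written in lowercase in the text.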

 

Some typical wordpacks in biomedicine.

Dictionary

You can add, delete, and edit wordcards, and export them to or import them from CSV. To make importing easier, you can download our Excel templates here. Fill in a template with your spreadsheet application and export it as CSV. This CSV can then be imported into a wordpack.

 

Simple list

List with properties
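
If your terms already live in a script or database, the CSV can also be generated programmatically instead of by hand. The sketch below is only an illustration: the column names and the "; " separator for multiple values are assumptions, and the real layout is whatever the downloaded template defines.

```python
import csv
import io

# Hypothetical column layout; replace it with the columns of the template.
COLUMNS = ["term", "synonyms", "abbreviations"]

wordcards = [
    {"term": "myocardial infarction",
     "synonyms": ["heart attack"],
     "abbreviations": ["MI"]},
    {"term": "diabetes mellitus",
     "synonyms": ["diabetes"],
     "abbreviations": ["DM"]},
]


def wordcards_to_csv(cards):
    """Serialize wordcards to CSV text, one row per wordcard."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for card in cards:
        writer.writerow({
            "term": card["term"],
            # Assumption: multiple values are joined with "; " in one cell.
            "synonyms": "; ".join(card["synonyms"]),
            "abbreviations": "; ".join(card["abbreviations"]),
        })
    return buffer.getvalue()


csv_text = wordcards_to_csv(wordcards)
print(csv_text)
```

Using the `csv` module rather than joining strings by hand ensures that terms containing commas or quotes are escaped correctly.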

The dictionary is all the wordpacks of a project together. It lets you see them in one place, browse them, and look up words when needed.

 

 

  • You can look up words by category or by using the "Search word" field.

  • It can be useful to search only among the user-added words.

  • Filtering by category can make the search easier. Use the "All Categories" dropdown menu.

  • It is possible to turn off the built-in dictionary by unchecking the "Show built in dictionary" checkbox.

  • All words in the built-in dictionary have a lock icon to show that they cannot be deleted.

  • Here you can edit the basic information of a word by using the "Add" button. You can also add new properties with the "New property" button.

  • Default properties are:

    • Created (date)

    • Created by

    • Abbreviations

    • MESH ID - identifier of a Descriptor or Supplementary Concept in the Medical Subject Headings (MeSH) controlled vocabulary

    • Wikidata ID - unique identifier (UID) used in Wikidata

Don't forget to "Save" the changes you made!