PDF2Tag analyzes the text of your PDF and tags it against a chosen SKOS thesaurus, whose data are gathered from the SPARQL endpoint hosting it. Once the text has been extracted, analyzed and tagged, you can download a text analysis, a match list and the tagged PDF. All computed results are cached for later reuse.
PDF2Tag tags on demand, using a different color depending on which top concept a matched word belongs to.
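The color-by-top-concept idea can be illustrated with a minimal sketch. This is not PDF2Tag's actual implementation: the labels, top concepts, colors and the `tag_words` helper below are all hypothetical, and a real thesaurus would supply the label-to-top-concept mapping.

```python
import re
from collections import Counter

# Hypothetical data: lowercase label -> top concept it belongs to.
# In practice this mapping would come from the chosen SKOS thesaurus.
LABEL_TO_TOP_CONCEPT = {
    "river": "Hydrosphere",
    "lake": "Hydrosphere",
    "forest": "Biosphere",
}

# Hypothetical color assignment, one color per top concept.
TOP_CONCEPT_COLOR = {
    "Hydrosphere": "blue",
    "Biosphere": "green",
}


def tag_words(text):
    """Return (word, top_concept, color) for every word matching a label."""
    tags = []
    for word in re.findall(r"[A-Za-z]+", text):
        top = LABEL_TO_TOP_CONCEPT.get(word.lower())
        if top is not None:
            tags.append((word, top, TOP_CONCEPT_COLOR[top]))
    return tags


tags = tag_words("The river flows through the forest into a lake.")
# Count how many matched words fall under each top concept.
counts = Counter(top for _, top, _ in tags)
for word, top, color in tags:
    print(f"{word}: {top} ({color})")
```

Every matched word carries the color of its top concept, so words from the same branch of the thesaurus are highlighted alike.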
Quick Start
- Choose a thesaurus and a SPARQL endpoint
- Define some PDF files to tag
- Start tagging
- Examine / download the four tag results for each PDF
- Use the provided sample PDFs
In Detail
- Choose a thesaurus and a SPARQL endpoint from the combo box above "SPARQL ENDPOINT:". Please pay attention to the thesaurus language, since the wrong language might cause zero matches and hence zero tags.
- Either load some PDF files from your file system by pressing the button "Choose PDF files", or drag and drop some PDF files from a file explorer into the region below the buttons. At the end of this operation the chosen files are listed below the buttons and the Tagging button is activated (orange).
- To start tagging, press the button "Upload...", which must be orange. It turns orange when at least one PDF file is listed below the buttons. Please wait a few seconds for the results, depending on the size of the PDF.
- For each PDF four results are computed: a) the tagged PDF with SKOS annotations, b) a text analysis with occurrence statistics for the matched terms, c) a match list and d) a SKOS expanded match list.
- In case you have no PDF files to upload, we provide some sample PDFs to be uploaded and tagged. Hover over the text "...prepared PDF samples" and check the box for the file you want to upload. If this does not work for some reason, we also provide a sample tagged PDF to download and examine: hover over the text "Example of a tagged PDF" and click the PDF icon that appears to download it.
- Please examine each tagged PDF in an external PDF viewer, not in your browser's inline viewer. Inline PDF viewers will not show the SKOS annotations.
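The language note in the first step can be made concrete with a small sketch. It only builds a query string and assumes the thesaurus exposes its terms as `skos:prefLabel` literals with language tags. The `build_label_query` helper is hypothetical, and the queries PDF2Tag actually sends may differ.

```python
def build_label_query(language):
    """Build a SPARQL query for preferred labels in one language.

    `language` is a plain language tag such as "en" or "de".
    """
    if not language.isalpha():
        raise ValueError(f"unexpected language tag: {language!r}")
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept ?label WHERE {\n"
        "  ?concept skos:prefLabel ?label .\n"
        # Keep only labels whose language tag equals the requested one.
        f'  FILTER(lang(?label) = "{language}")\n'
        "}"
    )


print(build_label_query("de"))
```

If the labels returned are in German while the PDF text is in English, few or no words will match, which is why the chosen thesaurus language should fit the language of the document.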