Help! Need an app suggestion for PDF highlight extraction

Is anyone aware of an app that lets you retrieve the position of a highlighted piece of text in a PDF?
If I could generate a link that opens a PDF viewer and takes me to the exact position of that highlight, that would be amazing.

I need such a feature to complete my digital Zettelkasten setup.

How do you generally keep your bibliography box and slip box connected? How do you keep your bib slip connected to the reference?

2 Likes

Interesting challenge. I imagine there is some tool that does it, but I’m not familiar with an exact match. That said, I use PDF-XChange Editor (most features are free, and a Pro version has even more). It’s Windows-only, unfortunately. But it has a variety of features that might partially accomplish your goal, including Bookmarks in PDFs and Comments (Highlight, Underline, etc.) attached to a specific piece of text. The Comments in particular can be exported, and the export includes location info. It appears to use a standard file type and syntax from Adobe (originators of the PDF format), .fdf or .xfdf. Here’s a sample of what’s in the export, for what it’s worth:

.xfdf format
<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve"
><f href="letter_birmingham_jail.pdf"
/><ids original="FC5DCDA4E0FBC17AFB7BA74B82D79165" modified="FC5DCDA4E0FBC17AFB7BA74B82D79165"
/><annots
><link Highlight="Invert" coords="114.96,410.16192,519.91704,410.16192,519.91704,419.99208,114.96,419.99208,72,400.08192,138.12,400.08192,138.12,409.91208,72,409.91208" page="0" date="D:20201122112716-08'00'" name="d6da296a-1657-4c20-8439778466f201ff" rect="72,400.08192,519.91704,419.99208" color="#000000" width="0"
/><link Highlight="Invert" coords="72,360.24192,503.86272,360.24192,503.86272,370.07208,72,370.07208,72,350.16192,131.10864,350.16192,131.10864,359.99208,72,359.99208" page="0" date="D:20201122112733-08'00'" name="f2b4a918-037e-4add-8677dd6cca5b8a9c" rect="72,350.16192,503.86272,370.07208" color="#000000" width="0"
/><highlight coords="72,667.91208,176.7,667.91208,72,658.08192,176.7,658.08192" title="Oshyan" creationdate="D:20201122113213-08'00'" subject="Highlight" page="0" date="D:20201122113223-08'00'" flags="print" name="b731ef27-9983-4267-8ea34b5d797d40c3" rect="70.048937,658.08191,178.651061,667.91211" color="#FFFF00"
><popup open="no" page="0" date="D:20201122113224-08'00'" flags="invisible,nozoom,norotate" name="d7ceec0d-6855-406a-9a6597c2809f4084" rect="389.5,542.5,539.5,622.5"
/></highlight
><text icon="Check" inreplyto="b731ef27-9983-4267-8ea34b5d797d40c3" title="Oshyan" creationdate="D:20201122113223-08'00'" subject="Sticky Note" page="0" date="D:20201122113223-08'00'" flags="print,nozoom,norotate" name="f939f71d-f62a-4ade-bb68ab68de77b62f" rect="0,-18,20,0" color="#FFFF00"
><popup open="yes" page="0" date="D:20201122113223-08'00'" flags="invisible,nozoom,norotate" name="71627f7b-0ed6-4a6f-92a89e9807ff0b48" rect="0,0,0,0"
/></text
><underline coords="72,519.59208,122.70864,519.59208,72,509.76192,122.70864,509.76192" title="Oshyan" creationdate="D:20201122113251-08'00'" subject="Underline" page="0" date="D:20201122113251-08'00'" flags="print" name="8e1d9b3e-873f-47d3-80e2b064126d6df1" rect="72,510.657807,122.708642,511.657807" color="#00FF00"
/></annots
></xfdf
>
.fdf format
%FDF-1.4
%âãÏÓ
1 0 obj
<<
/FDF <<
/F (letter_birmingham_jail.pdf)
/ID [(ü]ͤàûÁzû{§K‚בe) (ü]ͤàûÁzû{§K‚בe)]
/UF (letter_birmingham_jail.pdf)
/Type /Catalog
/Annots [2 0 R 3 0 R 4 0 R 5 0 R 6 0 R 7 0 R 8 0 R]
>>
>>
endobj
2 0 obj
<<
/A <<
/D [0 /XYZ 0 792 null]
/S /GoTo
>>
/C [0 0 0]
/M (D:20201122112716-08'00')
/NM (d6da296a-1657-4c20-8439778466f201ff)
/Page 0
/Rect [72 400.08192 519.91704 419.99208]
/Border [0 0 0]
/Subtype /Link
/QuadPoints [114.96 410.16192 519.91704 410.16192 519.91704 419.99208 114.96 419.99208 72 400.08192 138.12 400.08192 138.12 409.91208 72 409.91208]
>>
endobj
3 0 obj
<<
/A <<
/D [0 /XYZ 0 792 null]
/S /GoTo
>>
/C [0 0 0]
/M (D:20201122112733-08'00')
/NM (f2b4a918-037e-4add-8677dd6cca5b8a9c)
/Page 0
/Rect [72 350.16192 503.86272 370.07208]
/Border [0 0 0]
/Subtype /Link
/QuadPoints [72 360.24192 503.86272 360.24192 503.86272 370.07208 72 370.07208 72 350.16192 131.10864 350.16192 131.10864 359.99208 72 359.99208]
>>
endobj
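For anyone who wants to pull the location info out of an export like the .xfdf one above, here is a minimal sketch in Python using only the standard library. It assumes the export follows the structure shown in the sample (a `highlight` element with `page`, `rect`, and `coords` attributes in the `http://ns.adobe.com/xfdf/` namespace). The function name `extract_highlights` and the trimmed `SAMPLE` string are made up for illustration and are not part of any tool.

```python
import xml.etree.ElementTree as ET

# Namespace used by the XFDF export shown above.
XFDF_NS = {"x": "http://ns.adobe.com/xfdf/"}

# Trimmed-down sample modeled on the export above (one highlight only).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<f href="letter_birmingham_jail.pdf"/>
<annots>
<highlight page="0" title="Oshyan" color="#FFFF00"
  rect="70.048937,658.08191,178.651061,667.91211"
  coords="72,667.91208,176.7,667.91208,72,658.08192,176.7,658.08192"/>
</annots>
</xfdf>"""


def extract_highlights(xfdf_text):
    """Return one dict per <highlight> annotation found in an XFDF string."""
    root = ET.fromstring(xfdf_text.encode("utf-8"))
    file_el = root.find("x:f", XFDF_NS)
    pdf_name = file_el.get("href") if file_el is not None else None

    results = []
    for hl in root.findall("x:annots/x:highlight", XFDF_NS):
        results.append({
            "pdf": pdf_name,
            # Page index is 0-based in the export.
            "page": int(hl.get("page", "0")),
            # Bounding box as [left, bottom, right, top] in PDF points.
            "rect": [float(v) for v in hl.get("rect", "").split(",") if v],
            "author": hl.get("title"),
        })
    return results


for item in extract_highlights(SAMPLE):
    print(item["pdf"], "page", item["page"], "rect", item["rect"])
```

The `rect` values are in PDF points, normally measured from the bottom-left corner of the page, so a separate step would still be needed to turn them into something a viewer can jump to.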

Given that the PDF format includes bookmarks, links, and similar functions as part of the spec, I guess a PDF editor that supports them may be your best chance of getting such a feature. It looks like some of the better Mac PDF apps are:

And this open-source one that specifically says it’s for reading and annotating scientific papers:
https://skim-app.sourceforge.io/

I know that doesn’t definitively answer your question, but hopefully it helps you get a step closer. Let us know if you do ultimately find a great tool for this!

1 Like

Maybe the premium version of Diigo.

1 Like

Thanks for the effort! This is a good technique that I had never explored.

Unfortunately, I want to be as minimalistic as possible with the apps in my workflow, so I wouldn’t want another app just to read the comments and highlights I make in the PDFs.

In my ideal situation, I would generate a local link for my highlight or comment in the PDF and paste it into the note-taking app of my choice as bibliography metadata for my slip. This would let me refer to the source material and my literature notes (comments) without having to maintain a separate slip box for each.
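As a rough illustration of what such a local link could look like, here is a small Python sketch that builds a `file://` URL with a `#page=N` fragment. The `#page=` open parameter is honored by browser-based PDF viewers, but many desktop viewers ignore it, so treat this as an assumption to test with your own viewer. It only reaches the page, not the exact highlight. The function name `highlight_link` and the example path are hypothetical.

```python
from pathlib import PurePosixPath


def highlight_link(pdf_path, page_index):
    """Build a file:// URL that asks the viewer to open a given page.

    pdf_path must be an absolute POSIX-style path.
    page_index is 0-based (as in an XFDF export); #page= is 1-based.
    """
    base = PurePosixPath(pdf_path).as_uri()
    return f"{base}#page={page_index + 1}"


# Hypothetical path, for illustration only.
print(highlight_link("/Users/me/papers/letter_birmingham_jail.pdf", 0))
```

A link like this can be pasted into any note-taking app that accepts URLs. Whether clicking it lands on the right page depends on which application handles the link.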

I have a feeling that the app named Hook might be able to help me with that, but I haven’t gone down that rabbit hole yet. I will update once I try it.

UPDATE: I just checked https://hookproductivity.com/help/general/features/ and I think it’s almost there. It links almost every file in macOS with each other and has backlinks as well.

It’s like Roam OS.

Unfortunately, it doesn’t have deep linking to the text level of PDFs. But there is an update coming this year that promises to do just that. I will keep an eye out for it.

Meanwhile, I found this other app that does text-level linking: https://apps.apple.com/in/app/pdfoo/id540272061?mt=12

I will update after I have tried it.

Unfortunately, there’s no mention anywhere on the website of being able to export the annotated sections of PDFs as links. Am I missing something?
Also, this came as a surprise to me. Is this a web annotation tool like Hypothes.is and WorldBrain’s Memex? I had never heard of it, and the website says they have more than 9M users. That is a LOT.

1 Like

I was trying this with Diigo. You should be able to highlight text inside PDFs and create a share link for each highlight, but it looks like it isn’t working. I’ll ask support why.

2 Likes

My app (Keypoints, see my intro here) tries to connect note taking, PDF annotation, and reference management. It can be used as a standalone app but can also work as a hub that feeds into other apps. Each note extracted from a PDF highlight gets its own URL. Clicking such a URL selects the note in the app and opens the corresponding PDF with the highlighted text next to it.

I realize that you only want to use a single app, but since you’d likely need a PDF viewer anyhow(?), maybe this would still be a feasible solution? My app isn’t available yet, but I wanted to mention it here since I wrote it for exactly this use case.

1 Like