Converting a PDF reference list to BibTeX


1. Rationale

1.1. BibTeX is how I use references

BibTeX is how I reference and cite papers. Citar picks the entries up through citar-bibliography and lets me:

  • insert citation links with org-cite-insert,
  • print the bibliography/references section in HTML and LaTeX PDF exports with #+print_bibliography:, and
  • give my cite links a follow action that opens files and URLs and, most importantly, creates and opens org-roam nodes.

1.2. PDF reference list is what you get from papers

The most reliable way to find papers worth reading is to look at the reference list of a paper you already know.

You'll get a list of entries like this:

Xinli Yu, Zheng Chen, Yuan Ling, Shujing Dong, Zongyi Liu, and Yanbin Lu. 2023. Temporal data meets llm–explainable financial time series forecasting. arXiv preprint arXiv:2306.11025.
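An entry like this sometimes already carries an identifier in plain sight. The following is a minimal sketch, not a robust parser, of pulling an arXiv id or a DOI-looking string out of one entry. The function name `find_identifiers` and both regular expressions are my own assumptions for illustration; real reference lists are messier than these patterns expect.

```python
import re

# New-style arXiv identifiers look like 2306.11025, optionally with a version suffix.
ARXIV_RE = re.compile(r"arXiv:\s*(\d{4}\.\d{4,5})(v\d+)?", re.IGNORECASE)
# Loose DOI pattern (an assumption): treat any match as a candidate, not a verified DOI.
DOI_RE = re.compile(r"\b(10\.\d{4,9}/[^\s\"<>]+)")


def find_identifiers(entry: str) -> dict:
    """Return any arXiv id or DOI-looking string found in one reference entry."""
    arxiv = ARXIV_RE.search(entry)
    doi = DOI_RE.search(entry)
    return {
        "arxiv": arxiv.group(1) if arxiv else None,
        # Trailing punctuation usually belongs to the sentence, not to the DOI.
        "doi": doi.group(1).rstrip(".,;") if doi else None,
    }


entry = (
    "Xinli Yu, Zheng Chen, Yuan Ling, Shujing Dong, Zongyi Liu, and Yanbin Lu. "
    "2023. Temporal data meets llm–explainable financial time series forecasting. "
    "arXiv preprint arXiv:2306.11025."
)
print(find_identifiers(entry))  # {'arxiv': '2306.11025', 'doi': None}
```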

1.3. DOI uniquely identifies a paper

The most important field in a BibTeX entry is the DOI, if the reference has one. Papers usually have one if they were published within the past 20 years and have been out for about a year (so some of the newest papers do not have a DOI yet).

If you have the DOI, reference managers like Zotero and JabRef can pull the bibliographic information from the DOI link and build the BibTeX entry for you. Zotero can even use it to fetch the open-access PDF.
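The same lookup can be done without a reference manager. As a hedged sketch: the doi.org resolver supports content negotiation, so asking for `application/x-bibtex` should return a BibTeX entry for DOIs whose registration agency supports that format. The helper names `bibtex_request` and `fetch_bibtex` are hypothetical, and the DOI in the usage comment is a placeholder, not a real paper.

```python
import urllib.parse
import urllib.request


def bibtex_request(doi: str) -> urllib.request.Request:
    """Build a request asking the DOI resolver for BibTeX instead of a landing page."""
    url = "https://doi.org/" + urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(url, headers={"Accept": "application/x-bibtex"})


def fetch_bibtex(doi: str, timeout: float = 30.0) -> str:
    """Resolve the DOI over the network and return the response body as text."""
    with urllib.request.urlopen(bibtex_request(doi), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage (needs network access; replace the placeholder with a real DOI):
# print(fetch_bibtex("10.1000/xyz123"))
```

The quality of the returned entry depends on what the publisher registered, so it is still worth checking the result before adding it to the bibliography file.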

2. Methods

2.1. [best] Semantic Scholar -> save to library -> library export

It turns out that

2.2. Web of Science, Crossref's references feature

Some websites already seem to provide this service for the source paper (the paper you take the reference list from), but the paper has to have been out for a while before those sites list it. One month is clearly not enough.

2.3. Crossref -> plain text -> DOIs -> BibTeX

Crossref has a nice feature that finds the DOIs for a list of single-line references: https://search.crossref.org/search/references

Its limits:

  • every reference entry must be on one line,
  • it takes a long time to run, and
  • it times out when it has run for too long; 100 references is clearly too many (though that count is normal in the AI field).
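Both limits can be worked around before pasting. Here is a minimal sketch that collapses each entry onto one line and splits the list into smaller batches. It assumes the copied entries are separated by blank lines, which depends on how the text came out of the PDF, and the batch size of 20 is an arbitrary guess rather than a documented Crossref limit. The names `one_line_entries` and `batches` are hypothetical.

```python
import re


def one_line_entries(raw: str) -> list[str]:
    """Collapse each blank-line-separated entry onto a single line."""
    entries = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        # Replace line breaks and runs of whitespace with single spaces.
        text = " ".join(block.split())
        if text:
            entries.append(text)
    return entries


def batches(entries: list, size: int = 20) -> list[list]:
    """Split the entries into consecutive chunks of at most `size` items."""
    return [entries[i:i + size] for i in range(0, len(entries), size)]


raw = "A. Author. 2020.\nSome title.\n\nB. Author. 2021.\nOther title.\n"
for batch in batches(one_line_entries(raw), size=1):
    print("\n".join(batch))
    print("---")
```

Each batch can then be pasted into the form on its own, so one slow lookup does not cost the whole list.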

2.4. Prophy -> DOIs on webpages -> Zotero connector

Prophy has the newest papers parsed, with links for each entry on their reference lists. Semantic Scholar does not list DOIs at the moment, so it is somewhat painful to use, since the PDF is slightly harder to get.

2.5. Scholarcy summary -> links to references -> Zotero connector

Scholarcy's article summarizer bookmarklet can parse the references section into single-line reference entries with links to their respective sources (Google Scholar, arXiv, etc.).

2.6. Scholarcy BibTeX export

The web app version of the article summarizer has a BibTeX export of the references section. The quality is similar to AnyStyle's, and the entries have no DOI.
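When an export like this comes without DOIs, it helps to know which entries still need one looked up. A rough sketch, assuming each entry starts with `@type{key,` at the beginning of a line and each field sits on its own line; it is a heuristic, not a real BibTeX parser, and `entries_missing_doi` is a hypothetical name.

```python
import re


def entries_missing_doi(bibtex: str) -> list[str]:
    """Return the citation keys of entries that have no doi field."""
    missing = []
    # Split just before each "@type{" that starts a line.
    for chunk in re.split(r"(?m)^(?=@\w+\s*\{)", bibtex):
        head = re.match(r"@(\w+)\s*\{\s*([^,\s]+)\s*,", chunk)
        if not head:
            continue  # leading text, @string, @comment, or anything unexpected
        if not re.search(r"(?mi)^\s*doi\s*=", chunk):
            missing.append(head.group(2))
    return missing


sample = """@article{yu2023,
  title = {T},
  doi = {10.1000/xyz123}
}

@misc{he2024,
  title = {U}
}
"""
print(entries_missing_doi(sample))  # ['he2024']
```

The keys and the DOI in `sample` are made up for the example.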

2.7. AnyStyle -> plain text references -> BibTeX


Author: Linfeng He

Created: 2024-04-03 Wed 23:23