For every PDF file in a directory, find its DOI and create a BibTeX entry for it in MATLAB

To achieve this task in MATLAB, you can follow the steps below:

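  1. First, make sure you have an internet connection, which is needed to download the text-extraction function used in step 3. You do not need to know the DOIs in advance: step 3 tries to read each DOI from the text of the PDF. For PDFs where that fails, you may need to look up the DOI manually.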

  2. Use the dir function to get a list of all the PDF files in the directory. Here's an example of how you can do it:

    main.m
    pdfDir = 'path/to/directory'; % Replace with the actual directory path
    pdfFiles = dir(fullfile(pdfDir, '*.pdf'));
    
  3. Loop through each PDF file, read its text content with the pdf2text function, and search that text for a DOI. pdf2text is not built into MATLAB; it is a File Exchange submission. The following command only opens its File Exchange page in your browser. Download the function from there and add its folder to the MATLAB path before running the loop:

    main.m
    web('https://www.mathworks.com/matlabcentral/fileexchange/61854-pdf2text');
    

    After installing the function, you can extract the text content and search for the DOI using regular expressions. Here's an example:

    main.m
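    for i = 1:numel(pdfFiles)
        pdfPath = fullfile(pdfDir, pdfFiles(i).name);
        textContent = pdf2text(pdfPath);
        % Match case-insensitively, since DOI suffixes often contain lowercase letters
        doiMatches = regexp(textContent, '10\.\d{4,9}/[-._;()/:A-Z0-9]+', 'match', 'ignorecase');

        % Assuming the first match is the DOI of the PDF itself
        if ~isempty(doiMatches)
            doi = doiMatches{1};
            % Use the file name (without extension) as the citation key
            [~, baseName] = fileparts(pdfFiles(i).name);
            citeKey = regexprep(baseName, '[^A-Za-z0-9_-]', '_');
            % Create a minimal BibTeX entry that holds only the DOI
            bibtexEntry = sprintf('@article{%s,\n  doi = {%s}\n}', citeKey, doi);

            % Do whatever you want with the BibTeX entry here
            disp(bibtexEntry);
        else
            warning('No DOI found in %s', pdfFiles(i).name);
        end
    end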
    

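    In this example, the regular expression follows the usual shape of a DOI: the prefix 10., a registrant code of at least four digits, a slash, and a suffix. The match can still pick up trailing punctuation from the surrounding sentence, or a DOI from the reference list rather than the paper's own, so you may need to adjust the pattern to the format of your PDFs.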

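  4. Decide what to do with each BibTeX entry, for example collect the entries in a cell array or append them to a .bib file. Keep in mind that an entry holding only a doi field is a stub: the remaining fields (author, title, year, and so on) still have to be filled in from another source.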
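
For step 4, one option is to ask the DOI resolver for a complete BibTeX record and append it to a .bib file. The two helpers below are a minimal sketch, not part of MATLAB or pdf2text: the names fetchBibtex and appendBibtex are hypothetical, and the sketch assumes that the DOI's registration agency answers content negotiation requests for the application/x-bibtex media type and that your MATLAB release supports the HeaderFields option of weboptions.

```matlab
% --- save as fetchBibtex.m (hypothetical helper name) ---
function bibtexEntry = fetchBibtex(doi)
% Ask https://doi.org for a BibTeX record instead of the landing page.
% Assumption: the DOI's registration agency supports content negotiation
% for application/x-bibtex; other DOIs may return HTML or an error.
    opts = weboptions('ContentType', 'text', ...
                      'HeaderFields', {'Accept', 'application/x-bibtex'}, ...
                      'Timeout', 15);
    bibtexEntry = strtrim(webread(['https://doi.org/' doi], opts));
end

% --- save as appendBibtex.m (hypothetical helper name) ---
function appendBibtex(bibFile, bibtexEntry)
% Append one BibTeX record to bibFile, followed by a blank line.
    fid = fopen(bibFile, 'a');
    if fid == -1
        error('appendBibtex:cannotOpen', ...
              'Could not open %s for writing.', bibFile);
    end
    closeFile = onCleanup(@() fclose(fid));  % close the file even on error
    fprintf(fid, '%s\n\n', bibtexEntry);
end
```

Inside the loop this could be called as appendBibtex(fullfile(pdfDir, 'references.bib'), fetchBibtex(doi)). DOIs that contain characters such as < or # would need percent-encoding before being placed in the URL, which this sketch does not do.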

Replace 'path/to/directory' with the directory that holds your PDF files, and handle the cases that can go wrong: a PDF that cannot be read, a PDF with no DOI in its text, or a PDF whose first DOI-like string belongs to a cited work rather than to the paper itself.
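
One way to handle these error cases is to wrap the per-file work in try/catch so that a single bad PDF does not stop the whole run. The function below is a sketch under stated assumptions: extractDoisSafely is a hypothetical name, and extractFcn stands for whatever text extractor you use (for example @pdf2text), assumed to return the text as a char vector or string.

```matlab
function [dois, failed] = extractDoisSafely(pdfPaths, extractFcn)
% pdfPaths   : cell array of PDF file paths
% extractFcn : function handle mapping a path to its text (assumption)
% dois       : cell array of DOIs, '' where none was found
% failed     : cell array of paths that were skipped
    dois = cell(size(pdfPaths));
    failed = {};
    doiPattern = '10\.\d{4,9}/[-._;()/:A-Z0-9]+';
    for i = 1:numel(pdfPaths)
        try
            textContent = char(extractFcn(pdfPaths{i}));
            doi = regexp(textContent, doiPattern, 'match', 'once', 'ignorecase');
            if isempty(doi)
                error('extractDoisSafely:noDoi', 'No DOI found.');
            end
            % Drop sentence punctuation that the pattern may have swallowed
            dois{i} = regexprep(doi, '[.,;:]+$', '');
        catch err
            warning('extractDoisSafely:skipped', ...
                    'Skipping %s: %s', pdfPaths{i}, err.message);
            failed{end+1} = pdfPaths{i}; %#ok<AGROW>
            dois{i} = '';
        end
    end
end
```

The paths collected in failed can then be reviewed by hand, which matches the manual lookup mentioned in step 1.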

Please note that the pdf2text function might not work well with all types of PDF files, particularly scanned documents that contain page images rather than a text layer, or files with complex layouts.
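
When text extraction fails, a possible fallback is to search the raw bytes of the file, because some publishers store the DOI in uncompressed metadata (for example in the document information or XMP block). This is a heuristic sketch, not a PDF parser: doiFromRawPdf is a hypothetical name, and the function returns an empty result for files whose metadata is compressed or carries no DOI.

```matlab
function doi = doiFromRawPdf(pdfPath)
% Search the raw file contents for the first DOI-like string.
% Heuristic: only finds DOIs stored in uncompressed parts of the file.
    fid = fopen(pdfPath, 'r');
    if fid == -1
        error('doiFromRawPdf:cannotOpen', 'Could not open %s.', pdfPath);
    end
    raw = fread(fid, Inf, 'uint8=>char')';   % bytes as one char row vector
    fclose(fid);

    doi = regexp(raw, '10\.\d{4,9}/[-._;()/:A-Z0-9]+', ...
                 'match', 'once', 'ignorecase');

    % Drop trailing punctuation, then any unbalanced closing parenthesis
    % left over from a PDF string such as (10.1000/xyz)
    doi = regexprep(doi, '[.,;:]+$', '');
    while ~isempty(doi) && doi(end) == ')' && sum(doi == ')') > sum(doi == '(')
        doi(end) = [];
    end
end
```

A call such as doi = doiFromRawPdf(pdfPath) could be tried whenever the text-based search returns no match.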
