One way to convert a PDF file to an Excel file in C# is to use a third-party library such as iTextSharp or Aspose.PDF to extract the data from the PDF, then write that data out in an Excel file format. The steps are:

  1. Add the iTextSharp or Aspose.PDF library to the project.

  2. Load the PDF file into memory.

  3. Extract the data from the PDF using the library's API.

  4. Create a new Excel file and add the extracted data to it.

  5. Save the Excel file to disk.

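Here is an example that uses the iTextSharp library to read the text and Microsoft.Office.Interop.Excel (which requires Excel to be installed on the machine) to write the workbook: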

using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
using System.IO;
using System.Text; // needed for StringBuilder
using Excel = Microsoft.Office.Interop.Excel;

public void ConvertPDFToExcel(string pdfFilePath, string excelFilePath)
{
    // Load the PDF file into memory
    using (var pdfReader = new PdfReader(pdfFilePath))
    {
        // Extract the data from the PDF
        var text = new StringBuilder();
        for (int i = 1; i <= pdfReader.NumberOfPages; i++)
        {
            var pageText = PdfTextExtractor.GetTextFromPage(pdfReader, i);
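            // Add a newline so the last line of one page does not run into the first line of the next
            text.Append(pageText).Append('\n');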
        }

        // Create a new Excel file and add the extracted data to it
        var excelApp = new Excel.Application();
        var workbook = excelApp.Workbooks.Add();
        var worksheet = (Excel.Worksheet)workbook.Worksheets[1];
        var lines = text.ToString().Split('\n');
        for (int row = 0; row < lines.Length; row++)
        {
            var cells = lines[row].Split('\t');
            for (int col = 0; col < cells.Length; col++)
            {
                worksheet.Cells[row + 1, col + 1] = cells[col];
            }
        }

        // Save the Excel file to disk
        workbook.SaveAs(excelFilePath);
        workbook.Close();
        excelApp.Quit();
    }
}
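
One caveat about the cell-splitting step above: text extracted from a PDF typically separates columns with runs of spaces rather than tab characters, so splitting on '\t' often puts the whole line into a single cell. The following is a minimal sketch of an alternative splitter, assuming that two or more consecutive whitespace characters mark a column boundary. The class name PdfLineSplitter and the method SplitIntoCells are hypothetical helpers, not part of iTextSharp, and the two-space threshold is a heuristic that may need tuning for a given PDF.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of iTextSharp): splits one line of extracted
// PDF text into cells using whitespace gaps as column boundaries.
public static class PdfLineSplitter
{
    // Assumption: a tab, or a run of two or more whitespace characters,
    // marks the gap between two columns.
    private static readonly Regex ColumnGap = new Regex(@"\s{2,}|\t", RegexOptions.Compiled);

    public static string[] SplitIntoCells(string line)
    {
        // Blank lines produce no cells.
        if (string.IsNullOrWhiteSpace(line))
            return new string[0];

        // Trim first so a trailing '\r' or leading indent does not create empty cells.
        return ColumnGap.Split(line.Trim());
    }
}
```

To try it, replace lines[row].Split('\t') in the example above with PdfLineSplitter.SplitIntoCells(lines[row]). For instance, the line "Item A    12   3.50" yields the three cells "Item A", "12" and "3.50", because the single space inside "Item A" is not treated as a column gap.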