You can use the PyPDF2 library in Python to pull selected pages out of a PDF document and save them as a separate file. Here's an example:

import PyPDF2

# Open the PDF file in read-binary mode
with open('example.pdf', 'rb') as pdf_file:
    # Create a PDF reader object
    pdf_reader = PyPDF2.PdfFileReader(pdf_file)

    # Select the pages to extract
    page_nums = [1, 3, 5]  # Extract pages 1, 3 and 5
    extracted_pages = [pdf_reader.getPage(i-1) for i in page_nums]  # i-1 to get the correct page index

    # Create a PDF writer object
    pdf_writer = PyPDF2.PdfFileWriter()

    # Add the selected pages to the writer object
    for page in extracted_pages:
        pdf_writer.addPage(page)

    # Save the extracted pages as a new PDF file
    with open('extracted_pages.pdf', 'wb') as output_file:
        pdf_writer.write(output_file)

In this example, we first open the PDF file in read-binary mode and create a PDF reader object. We then build a list of page objects (PageObject instances) by calling getPage() once for each page we want. Next, we create a PDF writer object and add the selected pages to it with addPage(). Finally, we save the extracted pages as a new PDF file using the writer's write() method.
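For anyone on PyPDF2 3.0 or later, where the camelCase names are no longer available, the same steps can be sketched with the newer names (PdfReader, reader.pages, PdfWriter.add_page). This is only a minimal sketch: copy_pages and extract_to_file are illustrative helper names, not part of the library, and it assumes an unencrypted input file.

```python
def copy_pages(reader, writer, page_numbers):
    """Copy the given 1-based page numbers from reader into writer."""
    for number in page_numbers:
        # reader.pages is zero-indexed, so page 1 lives at index 0
        writer.add_page(reader.pages[number - 1])
    return writer


def extract_to_file(src_path, dst_path, page_numbers):
    """Write the selected pages of src_path to a new PDF at dst_path."""
    # Imported here so copy_pages stays usable without PyPDF2 loaded
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = copy_pages(reader, PdfWriter(), page_numbers)
    with open(dst_path, "wb") as output_file:
        writer.write(output_file)
```

Calling extract_to_file('example.pdf', 'extracted_pages.pdf', [1, 3, 5]) would then do the same job as the code above.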

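Note that getPage() takes a zero-based index, while the numbers in page_nums count from 1 the way a reader would, so we subtract 1 from each number before passing it to getPage(). This code also assumes the PDF is not password-protected or encrypted; an encrypted file has to be unlocked first with the reader's decrypt() method. Finally, the names used here (PdfFileReader, PdfFileWriter, getPage(), addPage()) belong to PyPDF2 releases before 3.0; from 3.0 on they were removed in favor of PdfReader, PdfWriter, the reader.pages list and add_page().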
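The subtract-1 step is easy to get wrong at the edges: a page number of 0 becomes index -1, which Python list indexing treats as "last item". One possible guard is a small helper that checks the range before converting. The name to_zero_based is hypothetical, and page_count is assumed to come from the reader (for example getNumPages() in the older API).

```python
def to_zero_based(page_numbers, page_count):
    """Convert 1-based page numbers to zero-based indices, checking range."""
    indices = []
    for number in page_numbers:
        # Reject anything outside the document instead of wrapping around
        if not 1 <= number <= page_count:
            raise ValueError(f"page {number} is outside 1..{page_count}")
        indices.append(number - 1)
    return indices
```

For a five-page document, to_zero_based([1, 3, 5], 5) returns [0, 2, 4], while a request for page 0 or page 6 raises ValueError.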