Python can select specific pages of a PDF document and save them to a new file using the PyPDF2 library. Here is an example:
import PyPDF2
# Open the PDF file in read-binary mode
with open('example.pdf', 'rb') as pdf_file:
    # Create a PDF reader object
    pdf_reader = PyPDF2.PdfReader(pdf_file)
    # Select the pages to extract
    page_nums = [1, 3, 5]  # Extract pages 1, 3 and 5
    # The pages sequence is 0-indexed, so page n is at index n-1
    extracted_pages = [pdf_reader.pages[i - 1] for i in page_nums]
    # Create a PDF writer object
    pdf_writer = PyPDF2.PdfWriter()
    # Add the selected pages to the writer object
    for page in extracted_pages:
        pdf_writer.add_page(page)
    # Save the extracted pages as a new PDF file while the source file is still open
    with open('extracted_pages.pdf', 'wb') as output_file:
        pdf_writer.write(output_file)
In this example, we first open the PDF file in read-binary mode and create a PdfReader object. We then select the pages to extract and build a list of PageObject instances by indexing the reader's pages sequence. Next, we create a PdfWriter object and add the selected pages to it using the add_page() method. Finally, we save the extracted pages as a new PDF file using the write() method of the writer object. The output is written inside the outer with block because the writer may still need to read page content from the source file, which must be open at that point.
The names above are those of PyPDF2 3.x. Older releases used PdfFileReader, getPage(), PdfFileWriter and addPage(), which were removed in PyPDF2 3.0.
Note that page numbers as shown to readers are 1-based while the pages sequence is 0-based, so we subtract 1 from each page number before indexing. Also, this code does not handle password-protected or encrypted PDF files; those have to be decrypted first.
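The subtract-1 conversion can select the wrong page without any error if a page number is out of range: a page number of 0 becomes index -1, which Python sequences treat as the last item. A minimal sketch of a guard follows. The helper name to_zero_based and its total_pages parameter are hypothetical and not part of PyPDF2. The page count is assumed to come from the reader, for example len(pdf_reader.pages).

```python
def to_zero_based(page_nums, total_pages):
    """Convert 1-based page numbers to 0-based indices.

    Hypothetical helper (not part of PyPDF2). Raises ValueError for any
    page number outside 1..total_pages instead of letting it wrap around.
    """
    indices = []
    for num in page_nums:
        if not 1 <= num <= total_pages:
            raise ValueError(f"page {num} is outside 1..{total_pages}")
        indices.append(num - 1)
    return indices


# Example: a 6-page document, extracting pages 1, 3 and 5
print(to_zero_based([1, 3, 5], 6))  # prints [0, 2, 4]
```

The returned indices could then be used in place of the i-1 expression when collecting pages from the reader.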
Asked: 2023-06-26 04:27:57 +0000
Last updated: Jun 26 '23