Custom input and output

gal kahana edited this page May 12, 2023 · 3 revisions

You might want to create a PDF in a memory buffer instead of a file, or retrieve the JPG images to embed from a DB rather than from files. In cases like these you would prefer streams over files, where the streams can be either those defined by the library or your own custom implementations.

The library allows stream input and output for most of its file-related functionality. You can do the following:

  • Stream PDF output - Create PDF output to any IByteWriterWithPosition stream. The library writes PDF content in a single pass, so you can trust that there will be no position changes. This is useful if the underlying stream implementation is a decoder or another forward-only stream.
  • Create the log traces in a stream - This allows you to save the log in a non-file form, for instance by passing it to the DB. You can also compress the log this way, combining the stream with either the library compression algorithms or your own.
  • Embed JPGs, PDFs, PNGs and TIFFs - You can use stream input for every image format the library supports. A good example is when images are stored in a DB. Note that, as opposed to output, which doesn't require random access, image input does require the underlying stream implementation to support random access (re-positioning of the read pointer).

The following goes through the various methods dealing with stream input and output.

PDF Output

To emit PDF to any IByteWriterWithPosition implementation, use the following methods of PDFWriter:

EStatusCode StartPDFForStream(IByteWriterWithPosition* inOutputStream,
			      EPDFVersion inPDFVersion,
			      const LogConfiguration& inLogConfiguration,
			      const PDFCreationSettings& inPDFCreationSettings);
EStatusCode EndPDFForStream();

Note that the above StartPDFForStream is not that different from StartPDF used for file output. The only difference is the first parameter, which is a pointer to an IByteWriterWithPosition. This is a stream implementation that supports reading the current position (but not changing it). For a discussion of streams see IO.

To finish a PDF created to a stream use the EndPDFForStream method. Make sure NOT to use EndPDF, which should be used only for a PDF file output workflow.
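As a minimal sketch of what such an output stream might look like, the following writer appends everything to a memory buffer. So that the example compiles on its own, it derives from a stand-in interface, ByteWriterWithPositionSketch, whose shape (one write method plus a position getter) is an assumption based on the description above. MemoryBufferWriter and Bytes() are hypothetical names. In real code you would derive from the library's IByteWriterWithPosition and match the exact signatures in its header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the library's IByteWriterWithPosition (assumed shape: one
// write method plus a position getter). In real code, include the library
// header and derive from its interface instead.
class ByteWriterWithPositionSketch {
public:
    virtual ~ByteWriterWithPositionSketch() {}
    virtual std::size_t Write(const std::uint8_t* inBuffer, std::size_t inSize) = 0;
    virtual long long GetCurrentPosition() = 0;
};

// Hypothetical writer that appends all output to a growable memory buffer.
class MemoryBufferWriter : public ByteWriterWithPositionSketch {
public:
    std::size_t Write(const std::uint8_t* inBuffer, std::size_t inSize) override {
        assert(inBuffer != nullptr || inSize == 0);
        if (inSize > 0)
            mBytes.insert(mBytes.end(), inBuffer, inBuffer + inSize);
        return inSize; // number of bytes accepted
    }

    long long GetCurrentPosition() override {
        // Output is sequential, so the position is simply the byte count so far.
        return static_cast<long long>(mBytes.size());
    }

    // Access to the collected bytes once writing is finished.
    const std::vector<std::uint8_t>& Bytes() const { return mBytes; }

private:
    std::vector<std::uint8_t> mBytes;
};
```

With the real interface in place, a pointer to such an object would be passed as inOutputStream to StartPDFForStream, and the buffer would hold the complete PDF after EndPDFForStream returns.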

If you plan to pause PDF creation using the State Saving feature, you should still use the Shutdown method to stop, but when continuing use the ContinuePDFForStream method instead of ContinuePDF. It looks like this:

EStatusCode ContinuePDFForStream(IByteWriterWithPosition* inOutputStream,
				 const string& inStateFilePath,
				 const LogConfiguration& inLogConfiguration);

Note that the only difference from ContinuePDF is that the first parameter is a stream, which takes the place of the output file.

Log

In Logging and Tracing there is a discussion of the library logging methods. The log is created by traces from the library (and you can use it too, of course, for your own purposes). You can direct log output to any IByteWriter implementation. Note that here, as opposed to writing PDFs, you only need an IByteWriter, which requires a single writing method, and not an IByteWriterWithPosition, which is an IByteWriter subclass that also requires getting the current position. This allows a customization that does not even have the features of a stream, such as a position. For instance, having learned the format of trace messages, you could store them in a DB table.

To customize the log output, use the LogConfiguration structure passed to the PDF starting methods. In that structure, set the LogStream member to the desired stream (there is a specific constructor for this).
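To illustrate the idea of a log target that is not really a stream, here is a sketch of a writer that cuts incoming bytes into records, one per line. It assumes that trace messages are newline-terminated text, which this page does not confirm, so check the actual trace format first. ByteWriterSketch is a stand-in for the library's IByteWriter with an assumed single-method shape, and LineRecordLogSink and Records() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for the library's IByteWriter (assumed shape: a single write
// method). In real code, derive from the library interface instead.
class ByteWriterSketch {
public:
    virtual ~ByteWriterSketch() {}
    virtual std::size_t Write(const std::uint8_t* inBuffer, std::size_t inSize) = 0;
};

// Hypothetical log sink: splits the byte flow into one record per
// newline-terminated line (ASSUMPTION: traces are newline-delimited text).
// A real sink could insert each finished record into a DB table.
class LineRecordLogSink : public ByteWriterSketch {
public:
    std::size_t Write(const std::uint8_t* inBuffer, std::size_t inSize) override {
        assert(inBuffer != nullptr || inSize == 0);
        for (std::size_t i = 0; i < inSize; ++i) {
            const char c = static_cast<char>(inBuffer[i]);
            if (c == '\n') {
                mRecords.push_back(mPending); // a line is complete
                mPending.clear();
            } else {
                mPending.push_back(c);        // keep partial lines across calls
            }
        }
        return inSize;
    }

    const std::vector<std::string>& Records() const { return mRecords; }

private:
    std::string mPending;              // bytes of the line still being received
    std::vector<std::string> mRecords; // completed lines, in arrival order
};
```

Because the sink never reports a position, it only fits the log case. PDF output needs the position-aware interface described earlier.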

TIFF, PNG and JPG Images input

Images may be embedded using either a file path, as we saw in Images Support, or a stream source. This is particularly useful when your images are not actually stored in files, but in, say, a database. When streams are used, image input comes from implementations of the IByteReaderWithPosition interface. This interface requires an implementation that allows reading as well as getting and setting the read position. This is due to the random access nature of reading TIFFs and JPGs (as well as PDFs, as we'll see soon). While PNG does not require random access, it uses the same interface for the sake of alignment.

The following PDFWriter methods should be used for embedding images. Note that all have a matching file input option:

  
// Create Image XObject from JPG
PDFImageXObject* CreateImageXObjectFromJPGStream(IByteReaderWithPosition* inJPGStream);
PDFImageXObject* CreateImageXObjectFromJPGStream(IByteReaderWithPosition* inJPGStream,
                                                 ObjectIDType inImageXObjectID);

// Create Form XObject from JPG
PDFFormXObject* CreateFormXObjectFromJPGStream(IByteReaderWithPosition* inJPGStream);
PDFFormXObject* CreateFormXObjectFromJPGStream(IByteReaderWithPosition* inJPGStream,
                                               ObjectIDType inFormXObjectID);
	
// Create Form XObject from TIFF
PDFFormXObject* CreateFormXObjectFromTIFFStream(IByteReaderWithPosition* inTIFFStream,
						const TIFFUsageParameters& inTIFFUsageParameters);
PDFFormXObject* CreateFormXObjectFromTIFFStream(IByteReaderWithPosition* inTIFFStream,
					        ObjectIDType inFormXObjectID,
						const TIFFUsageParameters& inTIFFUsageParameters);
// Create Form XObject from PNG
PDFFormXObject* CreateFormXObjectFromPNGStream(IByteReaderWithPosition* inPNGStream);
PDFFormXObject* CreateFormXObjectFromPNGStream(IByteReaderWithPosition* inPNGStream,
					        ObjectIDType inFormXObjectID);

Note that in all methods, the only difference from the matching file-based methods is the first parameter, which is an IByteReaderWithPosition interface implementation.
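For the images-in-a-DB case, an input stream can be as simple as a reader over a byte blob already fetched into memory. The sketch below derives from a stand-in interface, ByteReaderWithPositionSketch, whose shape (read, end check, get and set position) is an assumption based on the description above. The real IByteReaderWithPosition may declare further methods that would also need implementing. MemoryBlobReader is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Stand-in for the library's IByteReaderWithPosition (assumed shape: read,
// end-of-data check, and getting/setting the read position).
class ByteReaderWithPositionSketch {
public:
    virtual ~ByteReaderWithPositionSketch() {}
    virtual std::size_t Read(std::uint8_t* inBuffer, std::size_t inBufferSize) = 0;
    virtual bool NotEnded() = 0;
    virtual void SetPosition(std::size_t inOffsetFromStart) = 0;
    virtual std::size_t GetCurrentPosition() = 0;
};

// Hypothetical random-access reader over a blob held in memory,
// for example image bytes loaded from a database column.
class MemoryBlobReader : public ByteReaderWithPositionSketch {
public:
    explicit MemoryBlobReader(std::vector<std::uint8_t> inBlob)
        : mBlob(std::move(inBlob)), mPosition(0) {}

    std::size_t Read(std::uint8_t* inBuffer, std::size_t inBufferSize) override {
        assert(inBuffer != nullptr || inBufferSize == 0);
        // Never read past the end of the blob.
        const std::size_t count = std::min(inBufferSize, mBlob.size() - mPosition);
        if (count > 0)
            std::memcpy(inBuffer, mBlob.data() + mPosition, count);
        mPosition += count;
        return count; // bytes actually copied
    }

    bool NotEnded() override { return mPosition < mBlob.size(); }

    void SetPosition(std::size_t inOffsetFromStart) override {
        mPosition = std::min(inOffsetFromStart, mBlob.size()); // clamp to the end
    }

    std::size_t GetCurrentPosition() override { return mPosition; }

private:
    std::vector<std::uint8_t> mBlob;
    std::size_t mPosition; // invariant: mPosition <= mBlob.size()
};
```

With the real interface in place, a pointer to such a reader would be passed as the stream argument of the methods listed above, for example as inJPGStream to CreateImageXObjectFromJPGStream.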

PDF file embedding

PDF files are one of the important inputs for the library. As with image input, PDF input can also come from streams. There are quite a few usages for PDF input (appending as pages, using as xobjects, merging pages, with or without the copying context), and all of them have a stream input option. As with images, the input must be an IByteReaderWithPosition implementation.

The following lists the methods in PDFWriter for stream input:

// Creating an XObject from a PDF input
EStatusCodeAndObjectIDTypeList CreateFormXObjectsFromPDF(IByteReaderWithPosition* inPDFStream,
							 const PDFPageRange& inPageRange,
							 EPDFPageBox inPageBoxToUseAsFormBox,
							 const double* inTransformationMatrix,
							 const ObjectIDTypeList& inCopyAdditionalObjects);
	
EStatusCodeAndObjectIDTypeList CreateFormXObjectsFromPDF(IByteReaderWithPosition* inPDFStream,
							 const PDFPageRange& inPageRange,
							 const PDFRectangle& inCropBox,
							 const double* inTransformationMatrix,
							 const ObjectIDTypeList& inCopyAdditionalObjects);

// Append pages from PDF
EStatusCodeAndObjectIDTypeList AppendPDFPagesFromPDF(IByteReaderWithPosition* inPDFStream,
						     const PDFPageRange& inPageRange,
						     const ObjectIDTypeList& inCopyAdditionalObjects);

// Merge pages from PDF with target PDF pages
EStatusCode MergePDFPagesToPage(PDFPage* inPage,
				IByteReaderWithPosition* inPDFStream,
				const PDFPageRange& inPageRange,
				const ObjectIDTypeList& inCopyAdditionalObjects);


// Create a copying context for a PDF input
PDFDocumentCopyingContext* CreatePDFCopyingContext(IByteReaderWithPosition* inPDFStream);

Note that all methods are simple overloads of the file based input methods, simply with an IByteReaderWithPosition interface implementation instead of a file path.

Summary

We saw that you can use streams for input and output instead of regular file operations. The library tests include examples of such usage for log output, image and PDF input, and PDF output.