What is a Collection?
A Collection is a feature unique to Impira. It's a folder that contains a group of files that have similar layouts and share the same schema (the set of fields you want to extract, such as Name, Date, and Balance).
However, while standard folders simply hold and organize files, Collections have a brain (i.e., a machine learning model) to actively learn how to extract data from those files.
Creating a Collection is the first step before moving on to extract data from your files.
While Collections can adapt to learn nearly any document layout, docs within a specific Collection should all have the same layouts and contain the fields you want to extract.
How do I determine which files go into which Collection?
Every Collection has a purpose that’s defined by the fields you want to extract. Collections continuously learn and improve, so their predictions on newly uploaded files become more accurate over time.
Group your files according to their layouts and the fields you want to extract. Create a Collection for each group (Example: “I want to extract the fields Vendor, Shipping address, and Delivery date from this group of purchase orders, so I’ll place them in their own Collection.”).
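The grouping rule above can be pictured with a short Python sketch. This is only an illustration of the idea, not an Impira API: the file records, the layout labels, and the `plan_collections` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical file records. The layout labels and field lists are invented
# for illustration; they do not come from any Impira API.
files = [
    {"name": "po_001.pdf", "layout": "purchase_order",
     "fields": ["Vendor", "Shipping address", "Delivery date"]},
    {"name": "po_002.pdf", "layout": "purchase_order",
     "fields": ["Vendor", "Shipping address", "Delivery date"]},
    {"name": "inv_001.pdf", "layout": "invoice",
     "fields": ["Name", "Date", "Balance Due"]},
]


def plan_collections(records):
    """Group file names by (layout, fields to extract).

    Each distinct key corresponds to one Collection you would create.
    """
    groups = defaultdict(list)
    for record in records:
        # Sort the fields so that their order does not create extra groups.
        key = (record["layout"], tuple(sorted(record["fields"])))
        groups[key].append(record["name"])
    return dict(groups)


plan = plan_collections(files)
for (layout, fields), names in plan.items():
    print(f"{layout} -> extract {', '.join(fields)} from {names}")
```

Files that share both a layout and a set of target fields end up under the same key, which mirrors the "one Collection per group" guidance.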
Example 1: Invoices
If you have a group of invoices with the same layout and you want to extract the fields Name, Date, and Balance Due from all of them, place them in the same Collection.
As you extract these fields and review your predictions, that Collection is trained on those invoices — its purpose is to look for Name, Date, and Balance Due from each invoice in the Collection. When you upload similar new invoices, Impira will immediately get to work extracting the same fields as before.
Example 2: Purchase orders
Let’s say you also have a group of purchase orders with the same layout and fields. You can put them in a new Collection and start extracting data. This new Collection will start learning the new type of document and customize itself by observing your corrections and confirmations.
You can even create another Collection for that same set of purchase orders if you want to extract different data without disrupting the first Collection. One Collection can extract the fields Name and Date, and another Collection can extract different fields like Address and Sales Tax.
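As a rough mental model, assuming each Collection keeps its own list of fields and its own set of file references, the arrangement above might look like this in Python. The `Collection` class, the file names, and the extra `Total` field are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Collection:
    """Hypothetical stand-in for a Collection: a name, a schema, and files."""
    name: str
    fields: List[str]
    files: Set[str] = field(default_factory=set)


# The same purchase orders (made-up file names) are placed in both Collections.
purchase_orders = {"po_001.pdf", "po_002.pdf"}

names_and_dates = Collection("PO names and dates", ["Name", "Date"],
                             set(purchase_orders))
address_and_tax = Collection("PO address and tax", ["Address", "Sales Tax"],
                             set(purchase_orders))

# Adding a field to one Collection (here an invented "Total" field)
# leaves the other Collection's schema untouched.
address_and_tax.fields.append("Total")

print(names_and_dates.fields)
print(address_and_tax.fields)
```

Because each Collection holds its own schema, changing what one of them extracts does not affect the other, even though both point at the same files.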
Types of Collections
A Collection (i.e., a standard or manual Collection) is like a blank canvas where you add extraction fields to teach the Collection about the data you want to extract. You train a Collection by confirming and correcting its predictions, which makes its future predictions more accurate.
Each Collection becomes customized to extract data from a group of similar files and continually grows more accurate with each user interaction.
Instant Collections are Collections that have been pre-trained on common document types. You can import them into an existing Collection or create a new Instant Collection to help you skip the typical training steps and get a head start on extracting data right away.
For example, if you have a group of ACORD 25 forms you want to extract data from, you can create a Collection and import the Instant Collection for ACORD 25 forms. Doing this sets up your Collection to automatically extract the fields and values found on these forms.
Instant Collections are still just as flexible and customizable as a standard Collection — they just help you get going faster.
Read more about Instant Collections.
Setting up a Smart Collection allows you to automatically filter new files into a group of similar files, just like you can with email filters. Newly uploaded files will automatically be routed to a Smart Collection that has been set up and trained to extract the fields you want.
Smart Collections help users get files into the right place and Instant Collections help users extract data more quickly by being pre-trained. You can have a Collection that’s both Smart and Instant.
For example, you could set up a filter that automatically routes any incoming file with “ACORD 25” in its file name to a Smart Collection. That Collection can also be an Instant Collection set to the ACORD 25 document type. Once an ACORD 25 form arrives in that Collection, Impira immediately gets to work extracting all the values you need from that form.
Read more about how to set up Smart Collections.
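The file-name filter in the ACORD 25 example behaves much like a simple routing rule. The Python sketch below is only an analogy and does not show how Impira implements Smart Collections. It assumes a case-insensitive substring match and assumes that files matching no rule stay unsorted in All files; the `route` function, the rule list, and the Collection name are hypothetical.

```python
def route(filename, rules, default="All files"):
    """Return the first Collection whose filter text appears in the file name.

    Assumptions for this sketch: matching is a case-insensitive substring
    test, and files that match no rule stay unsorted in `default`.
    """
    lowered = filename.lower()
    for filter_text, collection in rules:
        if filter_text.lower() in lowered:
            return collection
    return default


# One hypothetical rule: (text to look for, destination Collection name).
rules = [("ACORD 25", "ACORD 25 certificates")]

print(route("ACORD 25 - Smith Co.pdf", rules))
print(route("invoice_0042.pdf", rules))
```

The first file matches the rule and goes to the ACORD 25 Collection, while the second matches nothing and remains unsorted.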
All files is a general repository folder that holds all your uploaded files. You can locate files by searching or filtering, and you can see which Collections each file belongs to. From All files, you can add any unsorted files to Collections.
Sorting files into the right Collections
While you’re in All files, locate the files you want to sort by scrolling, searching, or filtering. Add individual files to a Collection by choosing Add to Collection to the right of the file name.
You can also check the box next to the file name and add file(s) to a Collection by selecting Add X files to Collection.
Either method of adding files to a Collection will lead to an opportunity to choose the destination Collection or to create a new one.
Adding files to multiple Collections
If you’d like to copy files from one Collection into another, first locate the files in the original Collection. Select the checkboxes next to the files you want to copy and choose Add X files to another Collection.
Choose your destination Collection from the drop-down menu and select Add selected files to Collection.
Removing files from a Collection
Open the Collection that contains the files you want to remove and select the checkboxes by their names. Choose Delete X files from Collection.
Renaming a Collection
- Open your Collection and select the drop-down menu next to the Collection name.
- Choose Rename Collection.
- Edit your Collection name and click Rename Collection.