ParseXtract - PX

Securibox ParseXtract

Automated data extraction.

Compatible with image and PDF files, PX lets you extract structured data from semi-structured documents.

View Documentation

Features

In real time

Automatically extract document data through real-time processing.

Improve quality

Avoid the errors that come with manual processing.

Reduce costs

Eliminate the costs associated with manual data entry.

Easy integration

Integrate smoothly into your process with just 3 lines of code.

How it works
Studio

Get structured data

Browser connection
API Connection

Simplify your data workflows and make them more productive with machine learning technology.

The ParseXtract API allows you to train on and extract data from PDF documents and transform it into a structured JSON format.

[Diagram labels. USER: data owner. YOUR APP: process or solution. DOC: image-based, PDF. PX: Securibox ParseXtract. JSON: structured extracted data. CLASSIFICATION: identify and group files into homogeneous collections. EXTRACTION: extract, validate and format document data.]
Models

Pre-trained

PX can be integrated, easily and quickly, to extract data from your documents.

Invoices, bank statements and payslips have already been modeled and pre-trained.

Output
{
    "detailedLabelId": "3f18d4a6bb6979ea3e9f7bce6ac61abc",
    "extractedData": [
        {
            "name": "Invoice.Type.Identifier",
            "value": "Invoice"
        },
        {
            "name": "Invoice.Date",
            "value": "18/09/2019"
        },
        {
            "name": "Invoice.Number.Identifier",
            "value": "2234567"
        },
        {
            "name": "Supplier.Name.Literal",
            "value": "Ma Société SARL"
        },
        {
            "name": "Supplier.National.Identifier",
            "value": "000000000000"
        },
        {
            "name": "Supplier.Siret.Identifier",
            "value": "554 874 445"
        },
        {
            "name": "Supplier.Vatnumber.Identifier",
            "value": "FR 000000000000"
        },
        {
            "name": "Invoice.Currency",
            "value": "EUR"
        },
        {
            "name": "Invoice.TotalAmount.WithoutTaxes.Amount",
            "value": "276,00"
        },
        {
            "name": "Invoice.VATTotal.Amount",
            "value": "55,20"
        },
        {
            "name": "Invoice.TotalAmount.WithTaxes.Amount",
            "value": "331,20"
        },
        {
            "name": "Customer.Contact.Name.Literal",
            "value": "Pénélope D. Seguin"
        },
        {
            "name": "Customer.VATNumber.Identifier",
            "value": ""
        },
        {
            "name": "Customer.Address.Line1",
            "value": "51 rue Nationale"
        },
        {
            "name": "Customer.Address.ZipCode",
            "value": "75003"
        },
        {
            "name": "Customer.Address.City",
            "value": "Paris"
        }
    ],
    "id": "DemoTrial_20100831_Armstrong_Neil_0014.pdf",
    "labelId": "FactureMaSociete"
}
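
The invoice output above is a list of name/value pairs, so a consumer will usually want to flatten it before use. The following is a minimal sketch of that step, assuming the response has already been received as a JSON string; the helper names `fields_by_name` and `parse_amount` are hypothetical and not part of the ParseXtract API, and the payload is a trimmed copy of the sample above.

```python
import json
from decimal import Decimal

# Trimmed copy of the sample invoice output shown above.
raw = """
{
    "extractedData": [
        {"name": "Invoice.Currency", "value": "EUR"},
        {"name": "Invoice.TotalAmount.WithoutTaxes.Amount", "value": "276,00"},
        {"name": "Invoice.VATTotal.Amount", "value": "55,20"},
        {"name": "Invoice.TotalAmount.WithTaxes.Amount", "value": "331,20"}
    ],
    "id": "DemoTrial_20100831_Armstrong_Neil_0014.pdf",
    "labelId": "FactureMaSociete"
}
"""


def fields_by_name(document: dict) -> dict:
    """Flatten the extractedData list into a name -> value mapping."""
    return {item["name"]: item["value"] for item in document["extractedData"]}


def parse_amount(text: str) -> Decimal:
    """Convert a comma-decimal amount such as '276,00' to a Decimal."""
    return Decimal(text.replace(" ", "").replace(",", "."))


document = json.loads(raw)
fields = fields_by_name(document)

net = parse_amount(fields["Invoice.TotalAmount.WithoutTaxes.Amount"])
vat = parse_amount(fields["Invoice.VATTotal.Amount"])
gross = parse_amount(fields["Invoice.TotalAmount.WithTaxes.Amount"])

# Sanity check: net amount plus VAT should equal the gross amount.
totals_consistent = (net + vat == gross)
print(fields["Invoice.Currency"], gross, totals_consistent)
```

Using `Decimal` rather than `float` keeps the totals comparison exact, which matters when the check is used to flag extraction errors.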
Output
{
    "detailedLabelId": "3f18d4a6bb6979ea3e9f7bce6ac61abc",
    "extractedData": [
        {
            "name": "Employee.Identifier",
            "value": "078904"
        },
        {
            "name": "Employee.Full.Name",
            "value": "Pénélope D Séguin"
        },
        {
            "name": "Employee.SocialSecurityNumber",
            "value": "2651132254647 79"
        },
        {
            "name": "Employee.Address.Line1",
            "value": "51 rue Nationale"
        },
        {
            "name": "Employee.Address.ZipCode",
            "value": "75003"
        },
        {
            "name": "Employee.Address.City.Name",
            "value": "Paris"
        },
        {
            "name": "Payslip.StartDate",
            "value": "01/08/2019"
        },
        {
            "name": "Payslip.EndDate",
            "value": "31/08/2019"
        },
        {
            "name": "Company.Name",
            "value": "Ma Société SARL"
        },
        {
            "name": "Company.SIRET.Identifier",
            "value": "55487445"
        }
    ],
    "id": "Bulletin_de_paie.pdf",
    "labelId": "MaSociete_label"
}
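
Every value in the payslip output above is a string, so dates and identifiers need converting before they can be compared or stored. The sketch below shows one way to do that, assuming the day/month/year date format seen in the sample; the dictionary is a hand-copied subset of the sample fields and `parse_day_first` is a hypothetical helper, not part of the ParseXtract API.

```python
from datetime import date, datetime

# Subset of the sample payslip fields shown above, already flattened
# into a name -> value mapping.
payslip_fields = {
    "Payslip.StartDate": "01/08/2019",
    "Payslip.EndDate": "31/08/2019",
    "Employee.SocialSecurityNumber": "2651132254647 79",
}


def parse_day_first(text: str) -> date:
    """Parse a DD/MM/YYYY string into a date object."""
    return datetime.strptime(text, "%d/%m/%Y").date()


start = parse_day_first(payslip_fields["Payslip.StartDate"])
end = parse_day_first(payslip_fields["Payslip.EndDate"])

# Number of days covered by the payslip, counting both end points.
period_days = (end - start).days + 1

# Strip the internal space so the identifier can be used as a lookup key.
ssn_compact = payslip_fields["Employee.SocialSecurityNumber"].replace(" ", "")

print(start.isoformat(), end.isoformat(), period_days, ssn_compact)
```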
Service

Discover our approach, evolve your business!

We do it our own way, building on our family of unsupervised classifiers, our document query language and our query generator engine.

— View documentation

Divide et impera

Using several uncorrelated unsupervised classifiers allows us to group similar documents together.

For instance, we are able to recognize a trademark in the document's header or a recurrent paragraph in the footer. Once the documents are grouped into the correct homogeneous collections, finding the right extraction rules is easier.

Whitebox

We have developed our own query language (PQL), which lets us navigate the layout structure of a document, jump to a specific point and apply regex selectors.

Machine learning techniques are used to automatically generate the extraction rules. As these queries are human-readable, we can always correct or improve them in case of overfitting or other issues.
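
PQL syntax is not shown here, so the following is only a loose analogy in plain Python, not PQL itself. It illustrates the general idea of jumping to a specific point in a document and then applying a regex selector from there. The page text, the labels and the `value_after_label` helper are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical page text; a real document would come from OCR or PDF parsing.
page_text = "Ma Société SARL\nInvoice No: 2234567\nDate: 18/09/2019\n"


def value_after_label(text: str, label: str, pattern: str) -> Optional[str]:
    """Jump to the first occurrence of label, then return the first
    regex match found after it, or None if either step fails."""
    anchor = text.find(label)
    if anchor == -1:
        return None
    match = re.search(pattern, text[anchor + len(label):])
    return match.group(0) if match else None


invoice_number = value_after_label(page_text, "Invoice No:", r"\d+")
invoice_date = value_after_label(page_text, "Date:", r"\d{2}/\d{2}/\d{4}")
missing = value_after_label(page_text, "Order No:", r"\d+")

print(invoice_number, invoice_date, missing)
```

Because a rule of this kind is just a label plus a pattern, a person can read it and adjust it by hand, which is the property the human-readable queries described above rely on.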


Security,
always the priority.

All our website and application traffic runs over encrypted SSL, and HTTP Strict Transport Security ensures that browsers interact with Securibox exclusively over HTTPS, so credentials and other sensitive data are never leaked over the network.

“When accessing our application, a unique token is sent along with each request, protecting against Cross-Site Request Forgery (CSRF). All the sensitive data stored on our servers is encrypted with AES 256-bit and rotating keys, so that the encryption is constantly changing.”

Get in touch.

Looking for more information? We’re always available *.

* This form allows you to contact Securibox with any general question. You can access and obtain a copy of the data concerning you, oppose its processing, have it rectified or erased, and limit its processing. The data sent through this form may be transferred outside Europe, in compliance with the GDPR.