Automating PDF Accessibility at Scale with Equidox AI
Equidox AI automates PDF accessibility and simplifies the process of digital compliance to a few clicks. Learn how this solution can result in substantial reductions in time and costs.
[Paul Campbell] Good afternoon, everyone. Welcome to our webinar for Equidox AI, which is a fully automated PDF remediation solution. We are very excited about this cutting-edge technology that is solving PDF remediation challenges at scale. By way of introduction, my name is Paul Campbell and I will be joined today by Dan Tuleta for the next 30 minutes. Quick logistics, if you have questions during the webinar please drop them in the Q&A chat button at the bottom of the screen and we'll get back to you with an answer. Additionally, this webinar will be recorded and will be sent after the meeting in addition to the slide deck and also a short survey, which we'd appreciate you filling out. We're also happy to do a more direct interactive session with you and other team members if they were not available to join and further our conversations as to how Equidox AI may be a fit for your organization specifically. Overview of the agenda today: first we're going to talk about who is Equidox Software Company, where we have been and where we are going, challenges we have seen in the market, and our solution to the problem. Then, Dan is going to talk about why we make PDFs accessible, what's driving this, why should people make PDF accessible, followed by an overview of Equidox AI how it works, and finally, a demonstration of our new bleeding-edge, powerful technology. So a little bit about Equidox: we've been around in existence for nearly a decade. Way back when, a Canadian citizen was trying to apply for a government job posting on the internet but she was unfortunately unable to do so because of her visual disability. She sued the government and won her case. The government sought out a solution at that point but they couldn't find one so they asked the marketplace to respond and that's hence why we as Equidox started to build a solution. And we really haven't stopped innovating since. 
We created a robust software as a service solution and now hundreds of customers currently use our SaaS product. The solution is world class and is adding tremendous value to the marketplace of digital accessibility for enterprise organizations, government organizations, and educational institutions. Our customers love the product, evidenced by the fact that nearly 100% of our customers renew their subscription every year. That is all well and good, however, we started hearing from organizations in the last couple years that had tens of thousands, hundreds of thousands, or even millions of documents that needed to be remediated and there simply wasn't an automated solution for that daunting need. The traditional service providers were sending the documents to India and other countries but there was no efficient way to scale so companies, unfortunately, were forced to settle for a solution that is time consuming, expensive, and doesn't truly mitigate the risk of a lawsuit from what we found. So enter Equidox AI and why we figured we wanted to solve this problem in the market. Equidox AI is a fully automated PDF remediation solution that removes the traditional, manual remediation methods and auto-tagging methods while increasing quality, accuracy, and compliance. Equidox AI is utilized for use cases where there are templated, recurring, large volumes of documents where manual remediation methods are just too cumbersome and daunting to address. There are three main challenges when it comes to PDF remediation that we have found. Number one: costs. Companies have multiple vendors. Outsourced providers and the internal investment in personnel are very costly, and it can really be a runaway train of cost, because the industry-standard price per page can be exponential and the manual work at this capacity and scale is extremely expensive. Quality, number two.
Because of the cumbersome, manual processes discussed, auto-tagging mishaps, and other issues, multiplied by the volume of pages in scope and their complexity, organizations are left exposed to non-compliance and potential lawsuits. And the last one is speed. Because of the demanding legal requirements and quick turnaround times to get accessible information to customers or employees consistently, it's not realistic to accommodate with traditional, manual processes and the traditional auto-tagging methods because of the sheer volumes to manage when coupled with that quality. So we obviously wanted to create a better way to solve for these challenges, and we have. Our experts have found a way to truly automate the PDF accessibility process for many use cases involving high volumes of documentation in a recurring way. Equidox AI automation allows, number one, for good quality, usability, and compliance every time because of our unique model creation using machine learning and artificial intelligence. We don't auto-tag, cut corners, or rely on any human element to get the process done, and the output not only will pass a checker but will be fully usable and accessible to anyone with a screen reader, which is very near and dear to our hearts as a remediation organization. Equidox AI automation also accommodates aggressive timelines. Because we're relying on the technology, we can dictate how fast the solution runs and turn the dial up and down, so to speak, to accommodate timelines that may be required because we have flexibility in the infrastructure and the compute that we apply to satisfy those timelines. And then lastly, Equidox AI automation allows for lower costs and process improvements and vendor consolidation where you don't have to rely on multiple vendors doing a common task. You can kind of have one organization to produce this automated solution for your high-volume, templated documents.
So with that said, now I'm going to turn it over to my colleague, Dan, to talk about why it's important to make PDFs accessible and how Equidox AI really works and, finally, a demonstration. Dan?
[Dan Tuleta] Great, thank you, Paul. Hi, everyone. So yeah, let's talk a little bit about why we are making PDFs accessible. So I assume that most people on this call are at least somewhat familiar with accessibility laws like the Americans with Disabilities Act or Section 508 of the Rehabilitation Act. I am not a lawyer so I'm not going to go into the details of all of these laws but just at a high level, there are requirements for organizations to provide physical access like wheelchair ramps, elevators, Braille signage, and organizations need to ensure that their public-facing digital content, including PDFs, is accessible to everyone just the same as they are required to make sure that there's physical access to their buildings and facilities. So ignoring the accessibility of your digital content opens up your organization to legal risks. There have been thousands of organizations who learned this the hard way when they were sued for exactly this type of problem, and there are thousands more who quietly pay large settlements and then they still have to ultimately go back and fix their accessibility deficiencies. So long story short, we live in a very digital world and we rely so heavily on digital information. So digital accessibility is not a fad and it's not going away, so it's always good to be aware of it and address it in a proactive way. So for anyone who is unsure of why any of this digital accessibility stuff matters in the real world, people with disabilities use various types of assistive technologies to interact with digital content like PDFs. A very common type of assistive technology is called a screen reader, which is capable of reading digital content like websites, applications, and documents.
Screen readers use digital tags to navigate documents, and these tags need to be properly encoded into the document to organize the content and make it compatible with the screen reader. So think of the tags as a framework of the document which gives the screen reader the ability to navigate and interact with all of the various elements in the PDF. Equidox, in cooperation with the National Federation of the Blind, surveyed about 250 blind or low vision individuals who rely on screen readers to interact with their PDFs on a daily basis. Based on this survey, we found that at least two-thirds of PDF documents are inaccessible to people with disabilities. So if you put yourself in the shoes of a blind person, you can quickly imagine how frustrated you would be if you could not read two-thirds of the documents that you came into contact with on a daily basis. On top of that, imagine the potential privacy issues that there would be if you had to ask your neighbor or your friend to help you read private documents like banking or investment statements, an invoice, a pay stub, or insurance policy documents. So just to further emphasize the points that I was making a couple of slides back, here is just some additional information about the volume and types of lawsuits that organizations have faced and will continue to face moving forward, and just to reiterate the digital accessibility requirements that organizations must adhere to. They are not going away and there will continue to be an increase in the attention that is paid to it by state and federal mandates, the Department of Justice, disability advocacy groups, and individuals who simply want to just be able to access their critical information. So one of the main challenges around PDF accessibility is that PDF documents, each one of them, is unique. 
We have heard a lot of empty promises over the years of fully automating PDF accessibility, but there are so many things about PDFs that require human interpretation to decide how to tag specific elements within the content. I have been working in the PDF accessibility market for over seven years and I have seen a lot of organizations assume that they have accessible documents because their documents have some tags in them. But they quickly learn that they are not usable, nor are they compliant, and they are still open to litigation. So I always tell people to beware of quote unquote auto tagging technology masked as a solution to fully automate PDF accessibility. These auto taggers that are floating around out there, they're capable of putting tags on the page but there will always be accuracy issues and the inaccuracy of these tags will lead to a lot of confusion and frustration for the screen reader user. Additionally, auto taggers can and will leave organizations open to further litigation because there is no guarantee of compliance with WCAG standards. So even paying to outsource your huge batches of documents to auto taggers, you're still not mitigating your risk of litigation because auto-tagging falls well short of true compliance with accessibility standards. And then, of course, the alternative of outsourcing the remediation work to third parties who are almost exclusively located overseas introduces a mountain of data privacy issues, and even if you can work around that with your use case, the sheer volume is impossible to keep up with. These outsourced remediation providers will cut corners to do the bare minimum of work that they need to make a document pass an automated checker, but they're not actually making the document compliant because it simply takes too long to meet the deadlines at that type of volume. 
So incorporating artificial intelligence, more specifically computer vision and machine learning, into high-volume PDF remediation allows our accessibility experts to train AI models to accurately identify and tag all of the elements in the document template. The use of AI developed by our data scientists paired with the human element of our trained accessibility experts allows for incredibly accurate, usable, and compliant PDFs to be returned to the customer in a fraction of the time because AI works exponentially faster than humans manually tagging each page. AI doesn't need to take vacations. AI can work 24/7/365 without breaks, and AI doesn't need to cut corners to meet a deadline. It can just do it the right way. So how does AI work? Our accessibility experts use example documents of customer templates to properly identify the various elements on the page. These elements might include text and paragraph structure, various levels of headings, lists and tables, graphs and images, and of course, the very important reading order of the content. This training data that we accumulate is then fed to the AI models to apply what they have learned en masse to many thousands or even millions of pages that have similar templates and formatting. The mechanics of how AI technology works are rather abstract and more complex than what I'm capable of showing you here in a simple PowerPoint slide, but here are a few examples of how we can visualize the AI at work. So for example, in this scatter plot each of the green dots represents a page in a PDF. They are grouped together by the AI based on similarities that the computer vision finds. So this cluster will contain all of the pages that contain pie charts, for example. In this example, you can see there are different multicolumn text layouts that the AI will use to recognize different pages and group them together appropriately so that it can apply the correct tag structure.
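[Editor's sketch] The page grouping described above can be illustrated with a very small example. This is not Equidox's method: the webinar says the grouping comes from computer vision, whereas the sketch below swaps in a simple distance-threshold ("leader") clustering over hand-written numbers. The three features per page (column count, image area fraction, table area fraction), the sample values, and the threshold are all assumptions made up for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_pages(features, threshold):
    """Leader clustering: put each page in the first cluster whose first
    member (its "leader") is within `threshold`; otherwise start a new one.
    Returns a list of clusters, each a list of page indices."""
    leaders = []   # feature vector of the first page in each cluster
    clusters = []  # page indices, parallel to `leaders`
    for index, vector in enumerate(features):
        for leader, members in zip(leaders, clusters):
            if distance(vector, leader) <= threshold:
                members.append(index)
                break
        else:
            # No existing cluster was close enough.
            leaders.append(vector)
            clusters.append([index])
    return clusters


# Hypothetical features per page:
# (column count, image area fraction, table area fraction)
pages = [
    (1, 0.60, 0.00),  # page 0: single column dominated by a chart
    (3, 0.05, 0.00),  # page 1: three-column text
    (1, 0.55, 0.00),  # page 2: another chart page
    (3, 0.00, 0.10),  # page 3: three-column text
    (1, 0.00, 0.80),  # page 4: mostly a table
]

clusters = group_pages(pages, threshold=0.5)
print(clusters)  # [[0, 2], [1, 3], [4]]
```

Each resulting group stands for one kind of page (chart pages, three-column pages, table pages), which is the idea behind the clusters of dots in the scatter plot: pages that look alike can share one tagging approach.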
The AI will pick up on font styles, sizes, and colors to help it establish the tags on the page. We can even train AI models to identify the many potential variations in tables, such as the numbers of columns and rows, table headers versus table data, and even tables of different sizes that might span across multiple pages. We'll talk a little bit about this when we get into the demo as well. The result of this extensive document analysis and feeding the training data to the AI is that we're creating fully compliant PDF documents without any human remediators who are, again, expensive to employ or outsource to, and they are, of course, liable to make human errors or be forced to cut corners just to meet unattainable, unrealistic deadlines due to the crazy volume demands. We are also reaching full compliance because this is not auto-tagging, just sloppily throwing tags on a page and saying that it's good enough. Beyond compliance and passing automated checkers, the bonus of using AI for high-volume and hyper-fast remediation is that it will produce incredibly accurate and very much usable documents for people with disabilities. So your customers who rely on assistive technology, they're not going to be filing complaints or lawsuits or calling your headquarters to complain that the document that you've given them cannot be navigated or understood because they're using a screen reader. So we are just about ready to jump into a demo. I promise the slides will end soon. But before we do I just want to make it clear that the underlying technology that we're talking about here, this can be deployed in several ways to align with your organization's requirements. So first and foremost, what we're going to be seeing during the demo is we've built an interface that allows us internally here, and potentially you if it's the right type of use case, to run the process from start to finish.
So basically doing bulk uploads of documents, running the batch process, and then downloading the finished PDF. We can also take the technology and, and this is kind of like what we ultimately envision for this technology, embed the AI models into an existing document creation and delivery system through the use of APIs. So this would be probably critical for customers needing to download their private documents like a monthly statement, or an explanation of benefits, or medical test results, or investment portfolio type of reports and status updates. Those types of documents that are produced en masse but contain private and sensitive information. Lastly, Equidox can operate the entire process on your behalf as a managed service. So we can take care of the remediation as well as the validation to ensure that everything exceeds all accessibility requirements, and then we would deliver fully compliant PDFs back to your organization to then be posted and distributed. So we'll talk a little bit about that in one of the demonstrations as well. Just one more thing to note, Equidox AI is tagging the PDFs at what we call the post-processing stage. So, and you'll see this during the demonstration, and what I mean by that is these PDFs have already been created and we are applying the accessibility as a final step before they are publicly distributed. The advantage of tagging PDFs post-processing is that we do not have to disrupt or completely rebuild your document creation process, which is probably fully established and has been sort of vetted out by your organization over a long period of time, and it wouldn't be ideal to have to completely redo that from scratch. So your designers and your producers of mass documentation can continue their process the way that they've been doing it, and we will handle the accessibility component at the very end of the creation stage, but right before the document reaches your customer.
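[Editor's sketch] The post-processing idea above, where accessibility is added as a final step and the existing creation process is left alone, can be pictured as appending one stage to a pipeline. Everything named here is hypothetical: `render_statement` stands in for whatever an organization already uses to create documents, and `add_accessibility_tags` is a placeholder that only appends a marker. It is not an Equidox API and it does not tag a real PDF.

```python
from typing import Callable, List

# A stage takes document bytes and returns document bytes.
Stage = Callable[[bytes], bytes]


def build_pipeline(stages: List[Stage]) -> Stage:
    """Compose stages so each one receives the previous stage's output."""
    def run(document: bytes) -> bytes:
        for stage in stages:
            document = stage(document)
        return document
    return run


def render_statement(data: bytes) -> bytes:
    """Hypothetical stand-in for an existing document creation step."""
    return b"%PDF-1.7 " + data


def add_accessibility_tags(pdf: bytes) -> bytes:
    """Hypothetical placeholder for the remediation step.
    A real implementation would add a tag structure to the PDF."""
    return pdf + b" [tagged]"


# The existing stages are reused as-is; the new step is appended at the end.
existing_stages = [render_statement]
pipeline = build_pipeline(existing_stages + [add_accessibility_tags])

output = pipeline(b"statement for customer 42")
print(output)
```

The design point is that `existing_stages` is not modified: the accessibility step runs after creation and before delivery, which matches the "final step before they are publicly distributed" described in the talk.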
Okay, so what we'll do, we're going to jump into the demonstration and so I'm going to leave the slide deck for just a minute and I'm going to switch over to our batch interface. So again, this is an interface that we have built pretty much just for demonstrations to help people visualize like what the technology is actually doing. But again, this technology can be deployed in a number of ways to kind of align with your specific use case and any internal requirements like around security or integration that your company or your organization would have. So what we'll do to get started is I'm going to go to the upload documents tab here on the interface and then I'm just going to open up the folders on my hard drive. What I'm going to do first is I'm going to grab a batch of financial statements. So this is just a simple .zip folder that contains, I can't even remember, 20 or 30 sample financial statements. So we're just going to use this for sort of a small scale example. If I drag and drop that batch of documents into this, I can then press the upload button. So I'm just going to give it a few seconds to upload and once it uploads, it's going to be available to have the AI models be applied to those various documents. So if I now go to the Create and Run Batch tab, I have a dropdown menu to select. I have some different models that are kind of preloaded here into my own little private demo account. So one of the models is called “Example Statement” so we're going to use this model to apply to those documents. Now I just have to select the .zip radio button, and I'm going to choose the financial statement .zip folder that I just uploaded, and then I just press Run Batch. Now this is going to kick off an automated process where Equidox is going to first unpack that .zip folder and it's going to identify all the various elements within these different statements.
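[Editor's sketch] The first step of the batch run, unpacking the .zip and handing each PDF to a model, might look roughly like the following. Only the standard-library `zipfile` calls are real. `apply_model` is a hypothetical placeholder that returns its input unchanged, the file names are invented, and the model name is simply the one selected in the demo. This is an assumption-laden outline of the flow, not how the Equidox interface is implemented.

```python
import io
import zipfile


def iter_pdfs(zip_bytes):
    """Yield (name, data) for every PDF entry in a .zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Skip folders and anything that is not a PDF.
            if info.is_dir() or not info.filename.lower().endswith(".pdf"):
                continue
            yield info.filename, archive.read(info)


def apply_model(model_name, pdf_bytes):
    """Hypothetical placeholder for the tagging step.
    Returns the input unchanged; a real step would return a tagged PDF."""
    return pdf_bytes


def run_batch(model_name, zip_bytes):
    """Unpack the archive and process each PDF with the chosen model."""
    results = {}
    for name, data in iter_pdfs(zip_bytes):
        results[name] = apply_model(model_name, data)
    return results


# Build a tiny in-memory archive so the example runs without any files.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("statement_01.pdf", b"%PDF-1.7 placeholder")
    archive.writestr("statement_02.pdf", b"%PDF-1.7 placeholder")
    archive.writestr("notes.txt", b"not a PDF")

finished = run_batch("Example Statement", buffer.getvalue())
print(sorted(finished))  # ['statement_01.pdf', 'statement_02.pdf']
```

Non-PDF entries such as `notes.txt` are skipped, so only the statements reach the model step.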
Now these statements are all relatively similar to each other but they can have quite a few variances. So just think about, like, what your credit card bill might look like. You might have a credit card bill one month that has just a single page because you only used it a couple of times, you only have a couple of charges. Then you might have another month where maybe it's holiday shopping season and you have 200 charges on that credit card over the course of the month, and then suddenly your bank statement or your credit card bill is a lot different looking. It's got three, four, five pages breaking down every single one of those charges, usually in a table format. So these are just some of the examples of like where you can have differences even though the documents are similar and are coming from really the same source. So while I was talking there you might have noticed these green lights just kind of lighting up across the screen. Equidox, after it unpacks the .zip folder, it will start applying the machine learning zones based on what it knows about this template. Once it finishes with the ML zones, it's going to run this export process and we can see these green status bars again lighting up. And then once the documents are finished, we get a Job Finish and a Job Success green light and all of these documents are again available for download. I can also see some basic information up here, like how much time elapsed for that process to run, how many documents ran, how many total pages. We're not too concerned about that right now, but we're really just going to look at kind of the resulting PDF. So, before we get into the completed one, if I just like unzip this for a second and let's just look at one of these documents that we started with. These documents were completely untagged so there's no tag structure at all. This would be just a completely useless page to someone who was blind. They would not be able to read any of this information.
They would not be able to understand their deposits and credits and withdrawals. All of this information would be completely lost on them because this document is not tagged at all, or if it were tagged it was probably not tagged properly. So what we do through that AI process is, if we just take a look at one of these documents that came out of the batch, and I'll download this and I will put this on my desktop just so we know it's the different one, and I'll open up my document that I just created. Now this document here, if you can tell underneath the accessibility tags tab, this is completely different. We have all of the elements accounted for on the page. But not only are they accounted for, they are accounted for correctly. So we have your bank name and your customer name information up here, we have a figure which would be, like, let's just say the logo of the bank in this example that we're using. So we can navigate through all of the different content. We have our heading structure, we have an H2, we have a table. The table is properly tagged, and if you're not too familiar with what the tag structures look like in accessible PDFs, this is kind of the whole point, that it is pretty complicated and it's very slow and manual to set this up document by document. So the use of automation and AI dramatically simplifies this process and it totally takes humans out of the equation. So because of that training work that we did on templates like this, our AI is able to fully understand and recognize the differences between these different elements and account for them in the tag tree in a fully automated way. Another use case that I can quickly explain would be a document. It's going to be a totally different document. Let's go here and I'll use this document. If I upload this one, this document here, if we take a quick peek at it, this is a totally different use case.
So we were just looking at a bank statement, which is kind of applicable to invoices or a pay stub or test results on the healthcare side of things. There's a lot of different use cases that would use things like statements, but this is an example here where we have a physician directory listing. These are thousands of pages long and they go on and on and on and they are consistently updated. So they're updated on a regular basis and that would require the user to go back to the document and re-remediate it month after month, or quarter after quarter, or year after year. On top of that, those documents are also created in many different languages, so depending on the market that the physician directory is located in, they're typically being produced in multiple languages. At least English and Spanish, sometimes Chinese, sometimes Arabic, it just depends on the region of the country. But those physician directories, the volume of them is impossible to deal with, and they're actually quite complicated documents. You have very complicated heading structures throughout, you need to of course make pages like this, and there's all kinds of different things that span across multiple pages that need to be accounted for. There's just simply no way for humans to be able to remediate these at the volume of literally thousands and thousands of pages that are constantly being updated on a monthly basis. So this is a use case that we're currently solving for. We have customers that are dealing with, like I said, literally millions of pages just like that across their different networks. And so we're accounting for that through the use of AI because they're dissatisfied with the results they were seeing from both auto-taggers as well as outsourced human remediators. The accuracy was terrible, the speed was too slow, and it simply just wasn't good enough for what they needed.
So we're letting the machine learning kind of do its thing here. This is a long document and there's a lot of tags for it to kind of interpret and apply. But what we'll see when we export the document is very similar to what we saw when we exported the bank statement. We are going to have accurate tags where it's properly accounting for the reading order, properly accounting for the heading structure, all of the different text elements are going to be identified. My little private environment seems to be a little bit sluggish today but we'll get there. Once it gets into the export it should go rather quickly. Okay, so with that said, just in the interest of time I realized that we're at 2:30 and people might have to be dropping off. So while we're waiting for that, oh, there we go. Perfect timing. It just finished. So if I open up this document now, and again I'll open it up in Adobe Acrobat, we remember that the original was not tagged at all. But I just wanted to show you that this one is properly tagged. So when you open up the tag structure you can see that all of these different elements are accounted for and they're accounted for in an accurate way, which is critical because this is an extremely difficult document to use if you're blind and these elements are not tagged correctly. Just imagine if it reads left to right across the three columns, you would have no idea what any of these doctors are, where they're located, it would be impossible to use this. So that's what Equidox solves for. Now with that said, I'm going to jump out of the demonstration and we'll go back into the slide deck just to wrap things up. We do have some articles that will be listed in the slide deck, so when we share the slide deck out if you'd like to learn a little bit more about PDF accessibility please feel free to browse through them or visit our website. And just in conclusion, I just want to say thank you to everyone for joining us here today.
So we hope you see the value and the capabilities of the new technology. Please do not hesitate to reach out to one of us for more of a one-on-one consultation so that we can discuss your organization's unique use cases and how Equidox AI can be applied to them. And again we will be sending out the recording of this webinar so please feel free to share this with anyone in your organization. We will include a link to the slide deck and for anyone who asked a question during the Q&A feature, we will get back to you as soon as possible. And again there will be a short survey so if you don't mind just taking a moment to fill that out we would greatly appreciate it. So thank you again everyone for joining and have a great rest of your day. For more information about how Equidox Software Company can help you with PDF accessibility, email us at EquidoxSales@equidox.co or give us a call at 216-529-3030 or visit our website at www.Equidox.co.