Mastering PDF Accessibility
Why you need to make PDFs accessible, a PDF remediation project plan, and document workflows.
[Dan Tuleta] Okay, well, my clock is showing two o'clock, so I think it's a good time to get started. Welcome, everyone. My name is Dan Tuleta. I'm the Senior Sales Engineer here at Equidox. I appreciate everyone joining us today for another edition of our Equidox Webinar Wednesdays. Today, we're going to be talking about mastering PDF accessibility, so let's get into it. Just quickly through the agenda of what we're going to be covering today during the webinar. First, we'll introduce who we are as Equidox. Then we'll talk a little bit about why we make PDFs accessible, how to put together a plan for PDF accessibility, and then some of the actual mechanics of the PDF remediation workflow. Then I'll finish things off with an Equidox demo. So, anyone who has never seen our software in action before will be able to get a peek at what Equidox looks like from a document remediation workflow perspective. It's a little bit different, and obviously better and faster, than Adobe Acrobat for a similar process of making PDF documents accessible.
Okay, so first and foremost, Equidox Software Company. Our mission statement is to enable PDF accessibility through intelligent automated solutions. We are an accessibility company with a niche in the PDF document space, so we are a leader in PDF accessibility. We're probably best known for our flagship Equidox software as a service, a web-based application for making PDF documents accessible. We also offer an AI solution for specific types of use cases, which we're not really going to get into in this webinar. But if you have high-volume, repetitive types of documents that follow a similar formatting and structure, please let us know. We definitely would love to chat with you about that. Okay, so let's talk about some of the market challenges.
PDFs are really one of the most difficult document types to make compliant with the Americans with Disabilities Act and the related standards, such as Section 508 and the WCAG guidelines. Equidox makes PDF accessibility easier and faster. We also put a real emphasis on quality, because Equidox is much easier to use than a competitive tool like Adobe Acrobat. It allows you to get through a larger volume of pages and documents. It's easier to use, so you can put the power of the tool into the hands of a much larger group of users. You no longer have to silo this to just a handful of resident experts in your organization. You can really have everyone be more responsible for the content that they create. Because you can go faster and you can deploy it to a broader group of users, it really allows you to put more of an emphasis on quality. One of the problems with PDFs is that organizations have so many of them, because they're being produced on a daily basis, and in many cases they've been produced for decades. Really, no one has ever addressed them from an accessibility standpoint. There are massive backlogs of documents on top of the documents that are being produced on a day-to-day or week-to-week basis. It's overwhelming for organizations, and that leads them to outsource the work overseas, where you get very sloppy work that's done just to make a document pass a checker. In some cases, if you're doing it internally, people end up having to cut corners because they're simply overwhelmed with the volume of content they have to work through. They don't have the manpower, and they don't have the time in the day to work through all of the backlogs and the constant flow of documents coming across their desks.
Equidox's ease of use, and the speed at which it lets you remediate, allow you to really put an emphasis on quality. For all of the documents that you are publishing in a public-facing place, whether that's your website, other applications, an LMS, or a content management system, you can make sure that those documents are not only compliant but also usable for people who use assistive technologies.
So, why do we make PDFs accessible? Well, mainly, you have to. It's part of the Americans with Disabilities Act and Section 508 of the Rehabilitation Act. Now, I'm not personally a lawyer, so I'm not going to explain all of the minute details of the laws. But we do have links in here that you're able to access if you'd like to read up on these different standards and guidelines and the things that you should be doing around PDF documents. Recently, there was a final ruling on digital accessibility in 2024 under Title II of the ADA, and it mandated that WCAG 2.1 AA standards must be met for web content and mobile applications. This, of course, includes the PDF documents that are on your websites and applications. Educational institutions as well as state and local governments both fall under Title II, so it really is something that you have to be aware of, start managing, and have a plan in place for. Just to mention, there are certain exceptions under the Title II ruling, but in most instances PDFs still need to be made accessible. A general rule of thumb is that if the PDF document is still actively in use or relevant to people, or if it's on your public-facing website, it likely needs to be made accessible. So, don't just assume that the exceptions always apply to you because, in general, they will not. The DOJ provided a fact sheet citing some of these examples where exceptions do not apply.
And here will be a link for all of those exceptions within that ruling. Just to help quantify the situation, Equidox, in cooperation with the National Federation of the Blind, did a survey of several hundred blind and low-vision people who use assistive technology on a day-to-day basis, asking what their experiences are. What their experiences have told us is that at least two out of three PDFs are not accessible. So, 67% of the documents that are floating around on the web in active daily circulation are not accessible. That presents a huge challenge to people who rely on assistive technology and are trying to interact with those documents and extract the same data that a sighted user would be able to. They simply cannot have that same experience, because the documents are not accessible, they're not compliant, and they're not compatible with their assistive technology. Therefore, they are left to find alternative ways to get that information, which leads to all kinds of challenges.
Okay, so let's talk about creating a plan for how we go about attacking PDF accessibility. There's definitely somewhat of an order of operations. Now, this can vary from organization to organization, depending on the size of your organization and the size of the PDF problem that you have. So, this is not a rigid set of eight things that you must do, but in general, a good order of operations would be to start by assigning staff. You obviously need personnel to help you address this problem. Next, you want to evaluate the scope. How big is this problem? Do we have documents that are 30 years old that we can archive and just completely wipe off of the website? Or do we have documents that always need to be posted online, where there's really no way that we could ever get rid of them?
Are we talking about a couple of hundred documents, a couple of thousand documents, or even tens of thousands or millions of documents in some extreme examples? Understanding how large the problem really is, is a good thing to do once you've assigned staff to help you evaluate it. Putting a written plan in place is always a good thing to do. It's always good to show that you, as an organization, are making a commitment to this plan to address PDF accessibility. So, get that written plan, distribute it within your organization, start holding people responsible, and make sure that everyone is following along with the plan. The next thing you want to do is prioritize the documents, and I'll talk a little bit more about that on the next slide. It's important to make sure that you're addressing the most important things first and the least important things last. You, of course, will want to choose a tool or a vendor. Some organizations, depending on the situation and all of the variables, might choose to outsource PDF remediation to a third party, hiring a company to do it for them. That has its advantages and disadvantages, but it's one way to go. You can also, of course, choose a tool. There are different PDF remediation tools out there, the primary one being Adobe Acrobat. Acrobat is kind of synonymous with PDFs; it's what the majority of people rely on. But there are alternatives, which we'll get into as well. Here we are in an Equidox webinar, so, of course, we'll talk a little bit about an alternative tool for this. Then, of course, you need to remediate the documents, which we'll talk more about, and I'll show you how we go about remediating PDFs.
Then, of course, validate the documents, making sure that everything you did during the remediation process is truly making the document compliant and usable. And then, of course, everything takes constant maintenance, right? It's not like you just get accessible and you're done forever. As your website, your applications, and your documents evolve over time, there are constantly new iterations of different things going up onto the website. So, you want to make sure that once you've gotten yourself caught up as an organization, you stay caught up. Don't get behind the eight ball again. It requires constant maintenance to make sure that you are, in fact, staying compliant.
Okay, so, addressing PDF remediation projects. A good first step is to remove the outdated or unnecessary files to help shrink down that pile of documents that you need to remediate. We've helped organizations evaluate their documents in the past, and it's pretty interesting what you'll find floating around on websites. You might have an invitation to the company picnic from 1998 still posted on the website, or something like that. Do these types of documents really need to still be living online? Old newsletters, old memos that are completely obsolete. There's a good chance that you have many documents sitting on your website that present a compliance issue and a risk of future litigation, and that don't actually need to be posted online anymore. So, it's a good idea to evaluate your site and everything that's posted publicly and remove the stuff that's not necessary. You also want to evaluate the number of PDFs that need to be remediated and their complexity. It's good to understand how complicated these documents are to remediate. Is it just simple text on a page?
Are there more challenging documents, like fillable forms? Do we have OCR content, stuff that was just scanned in and turned into a PDF? Then also decide what will be done in-house and what will be outsourced. If you do decide to go with a mixed approach of outsourcing some and doing some in-house, you can make sure that you have a handle on exactly what you want to ship out versus what you would like to keep in-house.
Okay, and I also mentioned we would talk a little bit about how to go about prioritizing documents. Just at a high level, our general recommendation would be to start with the content that is the most frequently used. Let's say you have a document that is front and center on your website, so that when someone comes to your website, there's a good chance they're going to open that document up. You really want to make sure that PDF is made accessible, because it's going to impact the largest number of people. Next is the most recent content. The newest stuff, the latest and greatest, you of course want to treat as high priority to make accessible, because it's the most up-to-date content on your site. From there, you would probably work towards the lowest complexity. What I mean by that is the easiest stuff to remediate. The reason I would put that as the third element on the list, instead of putting it towards the end, is that you want to be able to get through the largest number of pages possible in the shortest amount of time, because that, again, is going to be more impactful. And then everything else: that would be your old documents and your really complicated documents that are going to be more time-consuming to remediate.
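The prioritization order described above (most used, then most recent, then lowest complexity, then everything else) can be sketched as a simple tiering function. This is only an illustration, not part of the webinar or of Equidox: the `PdfRecord` fields, the file names, and the thresholds (`min_views`, `max_age_days`, `max_simple`) are hypothetical and would need to be tuned to an organization's own analytics.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PdfRecord:
    """One PDF in the remediation backlog (all fields are illustrative)."""
    name: str
    monthly_views: int   # how often the document is opened
    last_updated: date   # when the content was last revised
    complexity: int      # 1 = simple text ... 5 = forms, scans, dense tables


def tier(doc, today, min_views=500, max_age_days=365, max_simple=2):
    """Assign a priority tier; the thresholds are assumptions, not a standard."""
    if doc.monthly_views >= min_views:
        return 1  # most frequently used content
    if (today - doc.last_updated).days <= max_age_days:
        return 2  # most recent content
    if doc.complexity <= max_simple:
        return 3  # quick wins: easiest documents to remediate
    return 4      # everything else: old or time-consuming documents


def prioritize(backlog, today):
    """Order the backlog by tier, breaking ties by how often a file is viewed."""
    return sorted(backlog, key=lambda d: (tier(d, today), -d.monthly_views))


backlog = [
    PdfRecord("old-annual-report.pdf", 3, date(2009, 5, 1), 5),
    PdfRecord("enrollment-form.pdf", 4200, date(2021, 1, 15), 4),
    PdfRecord("spring-newsletter.pdf", 120, date(2024, 3, 1), 2),
    PdfRecord("parking-policy.pdf", 40, date(2018, 6, 1), 1),
]
ordered = [d.name for d in prioritize(backlog, today=date(2024, 6, 1))]
print(ordered)
```

With this sample data, the heavily used form comes first, the recent newsletter second, the simple policy document third, and the old, complex report last.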
I would put those older, more complicated documents towards the end, because they are probably going to have the least impact on the majority of people. So, that's our general recommendation for how you go about approaching this. But of course, there are always variables that come into play for every unique organization. This is not a rigid set of rules, but it is our general recommendation and a good place to start when you're evaluating the scope of your project.
Okay, so when I go through the actual demonstration today, you're going to hear me talk a lot about tags. In case anyone is not aware of what tags are or how they work: all elements on a page require a digital identifier, known as a tag, to be read by assistive technology. Different elements on different PDFs are going to be tagged in different ways. Some of the primary tags that you'll see a lot of are things like text, images, headings, links, lists, and tables, and of course, the reading order of all of the content on the page is very important as well. These different tags are a way of organizing the information on the page to make sure that the screen reader is going to navigate the document and read the content in the same way that a sighted user would read the page.
There are some general pre-flight document observations that are good to take into account. First of all, how many pages are there in this document? Is it a simple one-page document, or is it a 500-page textbook? There's a difference in how you might approach the project depending on its size. How complex is it? Is it a very difficult fillable form with 500 text input fields that need to have tooltips written for them, or is it just a couple of paragraphs on the page, where it's very simple to go about tagging?
Is the design and the formatting of the document consistent throughout, or is it some sort of Frankenstein document that was brought together from three different sources and strung together as one PDF? Is there an existing tag structure? If the tags are there, are they worth keeping, or are they tags that you really ought to throw away so you can start over from scratch? It's always case by case with PDFs. You don't always know. Does the document require OCR? OCR stands for Optical Character Recognition. What that means is, is the document just an image? That might be common if you have a scanned document, for example. Not all PDFs are created equal. Sometimes a PDF is simply a scan of a page. As far as a machine is concerned, it's just an image, and within that image there might be text that's readable to a sighted person, but you need to use a process called OCR to extract that text from the image so that it's actually readable and usable by someone who is using a screen reader. Are there form fields? Fillable forms take a little bit of extra work and TLC to make accessible. That's just the nature of a fillable form. But understanding whether the document is a form or just a standard, plain PDF is important to know. And also, are there images? Are the images informative and technical? Do you have diagrams and charts and things that are providing technical information about the document, or are they just decorative? Is it just repetitive logos? Is it just black and blue and red background colors that are there in the form of an image, but really only for the visual aesthetics of the document?
It's, again, always case by case, and these are things to take into consideration when you're looking at a document that you're about to remediate.
And then the workflow for PDF remediation. In general, I would recommend tagging all of the text. As a general rule of thumb, and there are certain exceptions to that, tag all of the text. Set all of the headings. There can only be one heading level one, for example, and we'll talk a little bit more about heading structure when we get into the demonstration. Add alt text to the images, and artifact any of the images that are not needed. Again, using the example of repetitive logos or background colors that are there for decoration purposes: if those are images in the document, you can artifact them so that the screen reader does not have to read them, especially when it's something that's just repetitive. Tag the lists, tag the tables, and make sure that all of the hyperlinks go to the correct URL or destination, wherever they're bound for. Set the reading order for each page. That's, of course, critical. Even if the tags are accurate, the tags must be read in the correct order, or else it can render the entire page useless. Imagine a three-column article where the screen reader is just reading clear across the three columns. You're going to hear a bunch of fragmented sentences and words that together will not make any sense. So, even if the tags are in the proper spot, it would render the entire page useless because it's all being read out of order. And then, of course, validate your work, and we'll talk a little bit more about how to do that. We're actually linking to a workflow blog in this slide deck, which we will share with you after the presentation. Okay, so here is the slide that we've arrived at, where we will embed the link to this demonstration.
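The three-column reading-order problem mentioned above can be illustrated with a small sketch. This is not how Equidox or any screen reader is implemented; it is a minimal model that assumes each tagged zone is known only by the position of its top-left corner, that the page is 612 points wide, and that the columns are of equal width. The `Zone` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """A tagged region on the page (hypothetical model for illustration)."""
    text: str
    x: float  # left edge, in points from the left of the page
    y: float  # top edge, in points from the top of the page


def naive_order(zones):
    """Read top to bottom, then left to right: this cuts across columns."""
    return sorted(zones, key=lambda z: (z.y, z.x))


def column_order(zones, page_width, columns):
    """Read each column top to bottom before moving on to the next column."""
    column_width = page_width / columns

    def key(zone):
        # Clamp so a zone on the far right edge still lands in the last column.
        col = min(int(zone.x // column_width), columns - 1)
        return (col, zone.y)

    return sorted(zones, key=key)


# A three-column article: A, B, and C are the columns; 1 and 2 are paragraphs.
page = [
    Zone("A1", 36, 100), Zone("B1", 228, 100), Zone("C1", 420, 100),
    Zone("A2", 36, 300), Zone("B2", 228, 300), Zone("C2", 420, 300),
]

print([z.text for z in naive_order(page)])           # reads across the columns
print([z.text for z in column_order(page, 612, 3)])  # reads down each column
```

The first ordering interleaves the columns (A1, B1, C1, A2, ...), which is the fragmented experience described above; the second keeps each column together.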
But I will jump out of my PowerPoint, and I will jump into Equidox. So, here I am inside of Equidox. Just at a high level, we're operating in a browser, so it's worth mentioning that Equidox is not software that you need to install or update or dedicate a license to on any individual machine. You can truly work from anywhere as long as you have an internet connection. We also work with a concurrent user licensing model, so it's important to understand how the licensing works. Let's just say your organization has 10 concurrent users. That means any 10 people within your organization could use Equidox simultaneously. Maybe you have 20 or 30 or 50 people who can access Equidox; it's just a matter of only 10 at a time. So, you can deploy it to a much bigger group of users without having to dedicate an annual subscription to every potential user who might only need to log in for 10 minutes once a week. Just something to consider. It's a different approach from how things are done with Adobe Acrobat, where every person needs to have a license of that software installed on their individual computer. If you don't have that license installed, then you just can't use Acrobat.
Okay, so what I'll do here is go into this document. I imported it just before we got started. I'm going to use it to walk through the basics of document remediation. Inside of the document, I can see that I have a thumbnail for the one single page, and if I click on that thumbnail, it will take me into the remediation page. Inside of the remediation page, the first thing that I'm struck by are these yellow rectangles. What these yellow rectangles are representing are essentially the tags. Just so everyone is aware, this document was completely untagged to begin with.
So, if I were to open this up inside of Adobe Acrobat, it's a completely untagged document. That means a screen reader wouldn't know what to do with any of the information on this page. It's not accessible in any way, shape, or form because it's completely untagged. Now, when I imported it into Equidox, Equidox was smart enough to recognize that there's a bunch of content on this page, and it is putting these reading zones into the places where it thinks they make the most sense. So, it's effectively auto-tagging the document just by importing it. However, we would say that you need to take it much further than just auto-tagging, because every PDF document is unique, right? This PDF document is different from the next one that I'm going to work on and the one after that. There isn't a universal set of rules that we can apply through an auto-tagging process and hope for the best. There are always nuances and details that we need to consider in each individual PDF document. So, we don't want to just import and export and cross our fingers and hope that it worked perfectly. We definitely need to go through this document and make sure that all of the different elements are accounted for.
Now, one button that's really important inside of Equidox is this button right here, which looks kind of like a computer monitor. When you press this button, it will open up a separate tab in your browser, and in this browser preview, you can see an HTML rendering of the page that you are currently working on. The reason this HTML is really useful is that it is essentially a representation of how a screen reader would read this page if we were to just stop working on it and export it as it currently sits. It doesn't take much of a trained eye to see that there are some pretty glaring issues here. Sure, most of the paragraphs are going to be read out loud, and that's perfectly fine.
But we have images that we need to address. We have a list right here that's currently not set up as a list at all. This is supposedly our table, so you can see what a mess the table is. All of that data has no structure to it. You would be giving a screen reader user a bunch of random numbers that would make no sense whatsoever. So, we need to make sure that we're addressing these types of issues.
Now, if I go back to the PDF, just to call out another important feature within Equidox, and it's a feature that I actually don't really need to use on this document, there's this sensitivity slider. When I move this slider back and forth, left and right, you'll see how these yellow rectangles rearrange themselves. You can use that slider to choose your best possible starting point, which I really already had when I arrived at the page. But if you want to move it around and see if it can do any better, it gives you a chance to see if there's a more optimal place to begin from, where you'll have less work to do with the individual elements. Just for example, if you were to bring this way over to the right, it takes a 10,000-foot view of the page. If you were to then go to your preview, you'll see it's kind of a problem. This is the wrong end of the spectrum. We don't want all of those zones to be grouped together as one giant paragraph, because that's equally incorrect, just on the other end of the spectrum. Alternatively, if you bring it all the way down to zero, all of that content is going to be removed from the preview, and you're just left with the two images. So, again, you really just want to find that sweet spot in the middle, where everything is going to give you a nice clean starting point.
Equidox can reduce that to, I don't know, how long does it take to hit L and nudge the slider? Maybe 5 seconds.
So you can imagine how much time you would save over the course of even one document. You're saving yourself probably close to 15 minutes on this list right here. You can see again that you have the exact same structure in this list as you do in the PDF itself.
This element here is our table, which right now is kind of a mess. You can probably tell we don't want individual zones covering up all of the different cells. That's what leads us to this terrible-looking structure here, where it's just a bunch of random numbers. But we can, of course, fix that. What I'm going to do is put a zone right on top of that entire table by clicking and dragging. Then, if I hit T on my keyboard (T for table), I can double-click on the table zone. If I then look at just the table, now isolated inside of the table editor, I can see these green grid lines, which you're free to drag around if you like. But you can also use the table detector. You're probably seeing the theme here of these very easy-to-use sliders. When you nudge the slider from left to right, it will wake up the artificial intelligence, and you can probably tell that these green grid lines are now in the perfect locations. Everything is in line with its row and column. If I go to the preview just after making that quick change, you'll see we've made a pretty dramatic difference. Instead of having whatever all this is, we actually have something that looks very similar to a table. I do have two little things that I need to correct, though, mainly the column headers here. If I hold Shift on my keyboard while I select the cells, I can span across all of those cells. That way, I'm not duplicating numbers, and I'm making sure that 23 and 24 actually straddle the four sub-columns that they are the column headers for.
Also, this table has two levels of column headers, so I just want to hit my up arrow over here to change the default of one column header row to two. If I go back to my preview, what I'll find is that the second row, which is represented by that bold font, is now also tagged as a table header. So all I have to do is save the table and close out of it. All of those individual zones that I had before have been overridden by the table, so now I have just a single table zone on top of everything. All of the little individual text zones that I had there before were removed.
The images: not to spend too much time on these, because these images are very simple. For this Equidox logo, I'm simply going to call it "Equidox logo" over here in the alt description field. This other one is really a similar rendering of that logo, but it has a dog in the picture. I would argue that this is just a decorative image; it's there to take up space. So I can simply hit backspace on my keyboard to get rid of that image. You can see the zone itself disappears, and if you go to the HTML preview again, you'll see that we've cleaned things up quite a bit.
One thing that I'm noticing is that my footer has somehow jumped in front of my table. The reason is that when I added that table zone, it naturally became the last zone on the page, which is what this little number indicates. If I hit reorder and refresh the preview, you'll see that now I have my table in the proper spot, and the footer comes at the very bottom. By default, reordering the page will order it from top to bottom. You also have different options for multiple-column layouts, so it just depends on what the page is calling for. So now this document is fully accessible, it's fully readable, it's fully compliant.
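To make the two-level column header idea above concrete, here is a small sketch of how a spanned header relates to the sub-columns beneath it. It is only an illustration under stated assumptions: the labels "23" and "24" come from the demo, but the sub-column labels Q1 through Q4 are hypothetical stand-ins, and the functions are not part of Equidox.

```python
def expand_header_row(cells):
    """Turn [(label, span), ...] into one label per physical column."""
    expanded = []
    for label, span in cells:
        expanded.extend([label] * span)
    return expanded


def column_header_paths(header_rows):
    """Combine stacked header rows into one full header path per column."""
    expanded = [expand_header_row(row) for row in header_rows]
    if len({len(row) for row in expanded}) != 1:
        raise ValueError("every header row must cover the same number of columns")
    # zip(*rows) walks the table column by column, top header first.
    return [" > ".join(labels) for labels in zip(*expanded)]


header_rows = [
    # Top level: each label spans four sub-columns.
    [("23", 4), ("24", 4)],
    # Second level: hypothetical sub-column labels, one column each.
    [("Q1", 1), ("Q2", 1), ("Q3", 1), ("Q4", 1)] * 2,
]

paths = column_header_paths(header_rows)
print(paths)
```

Each data cell in a column can then be announced together with that column's full path, for example "23 > Q1", which is roughly the association that correctly spanned and tagged header cells give a screen reader user.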
All I would need to do is go to the export tab and hit generate PDF, and it will produce this brand new document for me, which is going to be fully tagged from start to finish, just like I saw in my HTML preview. If I open this up in my Acrobat application, I'll be able to show you the tags, if Acrobat will wake up. Of course, it will sleep on me. Let's download it, I guess. So this is the new one that I just created, and you can see this document is fully tagged. Alright, so, excuse me, wrong X, I want to hit this one.
So, back to the slide deck. We'll wrap things up here in the interest of time. Let's see, we are on this one here. We have a few links to relevant articles, with more in-depth information based on what we were talking about today. And that's going to conclude our webinar for today. So again, we are Equidox. You can reach out to us at EquidoxSales@Equidox.co; that's a general sales email address. Our phone number is 216-529-3030. Of course, you can find us at www.Equidox.co, on Facebook, or on any of your preferred platforms. We are always happy to take individual calls if you'd like to reach out and have a deeper discussion about your specific needs around PDF remediation, or to see a more personalized demo, even using your own documents. We'd be happy to accommodate that, so please don't hesitate to reach out. So thank you very much, everyone. Have a great rest of your Wednesday afternoon.
For more information about how Equidox Software Company can help you with PDF accessibility, email us at EquidoxSales@Equidox.co, give us a call at 216-529-3030, or visit our website at www.Equidox.co.
PDF accessibility doesn’t have to be hard.
In this 30-minute webinar, learn how to approach large PDF remediation projects, follow a workflow for tackling each PDF and its elements, and see how Equidox software makes PDF accessibility easy, even if you have no prior accessibility skills.
Mastering PDF Accessibility Slide Deck