As promised, a quick post on one of the things you can do immediately to make your unstructured data more AI-ready. Simply put, give your unstructured data…a bit of structure.
As a refresher, unstructured data is content that doesn’t exist in neat rows and columns: word processing documents, text files, audio recordings, blog posts, images, emails, meeting transcripts, etc.
Let’s look at Word for an example. One of my old Microsoft colleagues — who supported Office for years — was fond of saying that 80% of Word users utilized only 20% of Word’s functionality. Here’s a piece of functionality your organization should start using regularly.

So Stylish…
That’s the Microsoft Word style picker. Most of your Word document content uses the Normal style, and many users ignore the other styles altogether. A style is essentially a collection of formatting settings. With a single click, you can change font, sizing, spacing, boldfacing, etc.

But the style choice can do something more. The styles add underlying structure, which can be used in document navigation. You can generate a table of contents from them, and see a collapsible outline of your document in the Navigation pane, like this.
And guess what? Your favorite AI tools can see this structure, too. Before a document is indexed, it is typically split into smaller passages, a step called chunking. Headings are a significant help in determining exactly where the chunk boundaries should be, which means information is more likely to stay grouped with other relevant information, producing more accurate vector embeddings.
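To make that concrete, here is a minimal sketch of heading-based chunking. It is an illustration only, not how any particular AI tool works. It assumes the document has already been read into a list of (style name, text) pairs, and that headings use Word's built-in "Heading 1", "Heading 2", etc. style names. The `Chunk` class and the `heading_level` and `chunk_by_headings` functions are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    heading_path: list[str]                      # e.g. ["Onboarding", "Laptop setup"]
    lines: list[str] = field(default_factory=list)


def heading_level(style_name: str):
    """Return 1 for 'Heading 1', 2 for 'Heading 2', ...; None for any other style."""
    prefix = "Heading "
    suffix = style_name[len(prefix):]
    if style_name.startswith(prefix) and suffix.isdigit():
        return int(suffix)
    return None


def chunk_by_headings(paragraphs):
    """Start a new chunk at every heading and remember the heading trail above it.

    `paragraphs` is a list of (style_name, text) pairs. Assumption: a library
    such as python-docx could supply these from paragraph.style.name and
    paragraph.text; here they are passed in directly.
    """
    chunks = []
    path = []        # headings currently "open", outermost first
    current = None
    for style_name, text in paragraphs:
        level = heading_level(style_name)
        if level is not None:
            # Drop headings at this level or deeper, then add the new one.
            path = path[:level - 1] + [text]
            current = Chunk(heading_path=list(path))
            chunks.append(current)
        elif text.strip():
            if current is None:
                # Body text that appears before the first heading.
                current = Chunk(heading_path=[])
                chunks.append(current)
            current.lines.append(text)
    # Headings with no body text of their own produce no chunk.
    return [c for c in chunks if c.lines]


sample = [
    ("Heading 1", "Onboarding"),
    ("Normal", "Welcome to the team."),
    ("Heading 2", "Laptop setup"),
    ("Normal", "Request a laptop from IT."),
    ("Normal", "Install the VPN client."),
    ("Heading 2", "Accounts"),
    ("Normal", "Ask your manager for access."),
]

for chunk in chunk_by_headings(sample):
    print(" > ".join(chunk.heading_path), "|", " ".join(chunk.lines))
```

Notice what happens in a document where everything is in the Normal style: the sketch has no boundaries to work with, and the whole file becomes one undifferentiated chunk.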
Bullets and Lists
Bullet lists and enumerated lists can also be a big asset in chunking, but that statement comes with a caveat: such lists can also hurt. The key is the chunking strategy; it must be set to respect the structure and recognize that all items in a list should be kept together.
Even when the content author doesn’t have influence over the chunking strategy, she can maximize the likelihood that her lists are a benefit to chunking by using styles to create structure around the bullets/enumerations. Recognizable headings before and after the list tell the chunking process that all items on this list belong together.
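Here is a rough sketch of what "respecting the list" could look like inside a size-limited chunker. It is one possible approach under stated assumptions, not a description of any specific product. It assumes list items carry Word's built-in list style names ("List Bullet", "List Number", "List Paragraph"), and the `group_blocks` and `pack_blocks` functions and the `max_chars` parameter are hypothetical names invented for this example.

```python
# Assumption: list items use Word's built-in list styles (including variants
# such as "List Bullet 2"). Custom style names would need to be added here.
LIST_STYLES = ("List Bullet", "List Number", "List Paragraph")


def is_list_item(style_name: str) -> bool:
    return style_name.startswith(LIST_STYLES)


def group_blocks(paragraphs):
    """Merge each run of consecutive list items into a single unsplittable block."""
    blocks = []
    prev_was_list = False
    for style_name, text in paragraphs:
        in_list = is_list_item(style_name)
        if in_list and prev_was_list:
            blocks[-1].append(text)      # continue the current list
        else:
            blocks.append([text])        # ordinary paragraph, or first list item
        prev_was_list = in_list
    return blocks


def pack_blocks(blocks, max_chars):
    """Fill chunks up to max_chars, but never split a block across two chunks.

    A block longer than max_chars still gets a chunk to itself, so a long
    list stays whole even if it exceeds the size limit.
    """
    chunks, current, size = [], [], 0
    for block in blocks:
        block_size = sum(len(text) for text in block)
        if current and size + block_size > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.extend(block)
        size += block_size
    if current:
        chunks.append(current)
    return chunks


travel_doc = [
    ("Normal", "Before you travel, pack the following:"),
    ("List Bullet", "Passport"),
    ("List Bullet", "Charger"),
    ("List Bullet", "Badge"),
    ("Normal", "Contact the travel desk with questions."),
]

for chunk in pack_blocks(group_blocks(travel_doc), max_chars=45):
    print(chunk)
```

A chunker that only counts characters could cut between "Charger" and "Badge". Treating the list as a single block removes that risk, at the cost of an occasional oversized chunk.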
Call To Action
The best time to plant a tree was twenty years ago, right? The second-best time is now. So now is a great time to implement best practices with your unstructured data, even if your organization hasn’t run a single AI query.
Your network drives and SharePoint sites are probably full of key information, and your team members’ heads hold even more of the institutional knowledge that you’ve been meaning to get “on paper” for years but never committed the time to capture.
Commit it now. Create a few templates using style structure, then allocate productive time for your team to revise old documents and create new ones. Establishing best practices now will make your AI investment of the future far more likely to generate a healthy return.
