
Metadata: Powering news discoverability

Alan Francis

Director, News Content and Metadata management

Bill Gates’ 1996 prediction that “Content is king” has proven remarkably accurate in today’s information age. The digital era has witnessed content become a primary driver of value, shaping industries and creating new opportunities for growth and innovation. Yet, as the volume of news and information has exploded, the realisation of this value now depends not just on having high-quality content, but on the ability to efficiently locate it amid a sea of data.

This is where metadata becomes essential. Metadata, or data about data, acts as the guiding beacon that enables users to filter, search, and pinpoint the news and insights most relevant to them. Without robust, consistent metadata, even the most valuable content risks being buried and overlooked. As technologies such as generative AI increase the complexity and volume of news, the accuracy, transparency, and consistency of metadata become ever more critical in helping users find exactly what they need, when they need it.

LSEG’s continual investment in metadata:

  • Improves the precision of news searches in LSEG Workspace and newsfeed solutions, empowering each user to isolate the news of specific interest to them
  • Leverages the comprehensive, uniform tagging consistently applied across Reuters trusted news and 10,000+ relevant news sources to drive decision making
  • Ensures the continuous expansion of our taxonomy, enhancing discoverability so that financial professionals can filter news more precisely than ever before, including on the latest trending news topics

Real value = coverage + discoverability

LSEG’s Financial News Service brings stories from Reuters trusted news and 10,000+ relevant news sources to financial professionals through both our flagship platform, LSEG Workspace, and our newsfeed solutions, offered across a range of latencies and variety of formats to suit every use case.

But the value of our Financial News Service lies not only in its comprehensive, broad and timely coverage. The metadata (i.e., data about data) that sits behind every piece of news ensures ease of access and precise filtering, so that news of specific interest can be found quickly and efficiently.

What is metadata?

Think of the text of a news story as its data. Metadata is the data about that data (e.g., its source, and the companies and subject matter it discusses). The metadata empowers you to find the news that interests you in the most efficient way.

Metadata initially comes from one of two sources: it is either supplied by the news provider (e.g., Reuters, Dow Jones or PR Newswire) or applied by our proprietary intelligent tagging engine using AI techniques. What distinguishes our approach is the broad range of metadata we apply to each story.

We distribute about a million stories per day. As we receive a story from its source (e.g., Reuters or Dow Jones), the story text is first augmented with basic metadata, which includes the title/headline, source, date/time received and language. We then enrich the metadata to identify items discussed in the story, such as organisations (e.g., public and private companies, central banks, government agencies and ministries), securities, funds, currencies, commodities, facilities and other physical assets. Metadata is then added for relevant industries, geographies, asset classes, security types, corporate actions, KPIs and much more. Relevant subject codes are applied to identify the subject matter at a granular level. For economic stories, the economic indicators discussed are identified. Further metadata enrichment includes sentiment, a ‘most-read’ indicator and significance codes (to flag stories of greater significance).
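To make the layering concrete, the sketch below models a story as a record that starts with the basic metadata and then gains enrichment fields. It is an illustration only: the `Story` class, its field names, the `enrich` helper and the `ORG-0001` identifier are all hypothetical and do not represent our actual schema or tagging engine, which uses AI techniques rather than simple name matching.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Story:
    # Basic metadata attached when the story is received
    headline: str
    source: str
    received_at: datetime
    language: str
    body: str
    # Enrichment layers added afterwards (hypothetical field names)
    organisations: list = field(default_factory=list)
    topics: list = field(default_factory=list)
    industries: list = field(default_factory=list)
    geographies: list = field(default_factory=list)
    sentiment: Optional[float] = None
    significant: bool = False


def enrich(story: Story, org_lexicon: dict) -> Story:
    """Toy enrichment step: tag organisations whose names appear in the text.

    org_lexicon maps an organisation name to a made-up identifier.
    A real tagging engine would use entity recognition, not substring checks.
    """
    text = f"{story.headline} {story.body}".lower()
    for name, identifier in org_lexicon.items():
        if name.lower() in text and identifier not in story.organisations:
            story.organisations.append(identifier)
    return story


example = Story(
    headline="Acme Corp raises full-year guidance",
    source="Reuters",
    received_at=datetime(2025, 1, 1, tzinfo=timezone.utc),
    language="en",
    body="Acme Corp said on Wednesday that demand had improved.",
)
enrich(example, {"Acme Corp": "ORG-0001"})
```

After the call, `example.organisations` holds the matched identifier while the remaining enrichment fields stay empty until later steps fill them.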

Broad, uniform tagging – a competitive advantage

These rich layers of information transform raw news into a highly discoverable and filterable resource. The full range of our news content is uniformly tagged, making the entire news collection highly searchable.

Every day, around 1 million news articles are received and processed. The processing includes metadata tagging plus indexing the story’s text into our text search engine to support searches that can combine metadata and text. All of this processing is fully automated and runs at great speed, using AI techniques such as named entity recognition and machine learning, so that news stories are fully tagged and available for use within seconds of arrival from our sources. The same processing is applied to research reports, filings and transcripts.
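A search that combines metadata and text can be pictured as two small inverted indexes whose results are intersected. The sketch below is a simplified illustration under that assumption, not a description of our search engine: the `NewsIndex` class, its methods and the story identifiers are hypothetical, and the topic code is used purely as a sample tag.

```python
import re
from collections import defaultdict


class NewsIndex:
    """Minimal in-memory index over story text and metadata tags."""

    def __init__(self):
        self._words = defaultdict(set)  # word -> ids of stories containing it
        self._tags = defaultdict(set)   # tag  -> ids of stories carrying it
        self._ids = set()

    def add(self, story_id, text, tags):
        self._ids.add(story_id)
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self._words[word].add(story_id)
        for tag in tags:
            self._tags[tag].add(story_id)

    def search(self, words=(), tags=()):
        """Return ids of stories matching every given word and every given tag."""
        result = set(self._ids)
        for word in words:
            result &= self._words.get(word.lower(), set())
        for tag in tags:
            result &= self._tags.get(tag, set())
        return result


index = NewsIndex()
index.add("s1", "Office vacancies rise in London", ["REAMCO"])
index.add("s2", "House prices rise in London", ["REAMRE"])
index.add("s3", "Office leasing slows in Paris", ["REAMCO"])

london_commercial = index.search(words=["london"], tags=["REAMCO"])
```

Here the text term narrows the result to the London stories and the tag narrows it further to commercial real estate, leaving only `s1`.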

Tagging is supported for over 13 million organisations. Each story is run through more than 2,000 topic models. Industry classification runs five levels deep and includes more than 1,100 tags. And the numbers keep growing.

A constantly evolving universe of metadata

Customers often raise two questions: do we have the coverage, and is the content discoverable? Both desktop and feed users are looking to find signal amid the noise. Tools like the News Topic Guide in LSEG Workspace (TOPICS) showcase the depth and breadth of our tagging. From asset classes and industries to geographies and corporate actions, our taxonomy is constantly expanding, with new topics added monthly to reflect emerging trends and global events. Another powerful tool is News Digest, a metadata-driven offering that delivers personalised, relevant news via LSEG Workspace. Feed customers can also tap into deeper metadata such as sentiment and relevance, use classic metadata signals (e.g., rating upgrade/downgrade, estimate increase/decrease), or develop custom signals based on metadata frequency (e.g., a spike in stories about a company) or other metadata anomalies.
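A custom frequency signal of the kind mentioned above could be as simple as comparing today's count of stories tagged with a company against its recent history. The sketch below is one possible approach, not a product feature: the `is_spike` function, the seven-day window and the threshold of three standard deviations are all illustrative assumptions.

```python
from statistics import mean, pstdev


def is_spike(daily_counts, window=7, threshold=3.0):
    """Flag the latest day if its story count is far above the trailing window.

    daily_counts is a list of stories per day for one company, oldest first.
    The latest day is compared with the mean and standard deviation of the
    preceding `window` days (a simple z-score test).
    """
    if len(daily_counts) < window + 1:
        return False  # not enough history to judge
    history = daily_counts[-(window + 1):-1]
    latest = daily_counts[-1]
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # Perfectly flat history: treat any increase as notable
        return latest > baseline
    return (latest - baseline) / spread > threshold


quiet_week_then_surge = [4, 5, 6, 5, 4, 6, 5, 30]
ordinary_week = [4, 5, 6, 5, 4, 6, 5, 6]
surge_flagged = is_spike(quiet_week_then_surge)
ordinary_flagged = is_spike(ordinary_week)
```

The first series is flagged because 30 stories is far outside the recent range of 4 to 6, while the second is not. In practice the window and threshold would need tuning to the coverage level of each company.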

Metadata in action

When a customer told us that our Real Estate Markets [REAM] topic was too broad, as they were only interested in commercial real estate, we expanded our taxonomy to provide greater granularity by adding topics for Commercial Real Estate [REAMCO], Residential Real Estate [REAMRE], Industrial Real Estate [REAMIN], and Land (Real Estate) [REAMLA]. When geopolitical tensions arose, we had Military Conflicts [WAR] and Embargoes / Sanctions [TRDEMB] at the ready. We regularly add topics for emerging technologies and other trends, such as the recently added AI Drug Discovery [AIDRUG].

As our taxonomy expands, users will continue to benefit from improved discoverability, filtering news more precisely and with greater transparency. At the core of our metadata is a powerful hierarchical classification system that runs from broad categories such as commodities, to more granular ones such as metals, to the most granular such as copper. This process ensures that every story is not just published but enriched for maximum ease of discovery and effectiveness.
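One way to picture a hierarchical classification is as a map from each topic to its parent, so that a filter on a broad topic also matches stories tagged with any of its descendants. The sketch below assumes that structure for illustration. The `PARENT` map and both helper functions are hypothetical, the parent-child links between the real estate codes are inferred from the example above rather than taken from the published taxonomy, and `commodities`, `metals` and `copper` are plain labels standing in for real topic codes.

```python
# Child topic -> parent topic (illustrative links only)
PARENT = {
    "REAMCO": "REAM",
    "REAMRE": "REAM",
    "REAMIN": "REAM",
    "REAMLA": "REAM",
    "metals": "commodities",
    "copper": "metals",
}


def ancestors(topic):
    """Return the chain of broader topics above `topic`, nearest first."""
    chain = []
    while topic in PARENT:
        topic = PARENT[topic]
        chain.append(topic)
    return chain


def matches(story_topics, wanted):
    """True if any story topic is `wanted` itself or one of its descendants."""
    return any(t == wanted or wanted in ancestors(t) for t in story_topics)


copper_chain = ancestors("copper")
broad_filter_hit = matches(["copper"], "commodities")
narrow_filter_hit = matches(["REAM"], "REAMCO")
```

A story tagged only with the narrow topic is found by the broad filter, but a story tagged only with the broad topic is not returned for the narrow one, which is what lets users choose how precisely to filter.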


Legal Disclaimer

Republication or redistribution of LSE Group content is prohibited without our prior written consent. 

The content of this publication is for informational purposes only and has no legal effect, does not form part of any contract, does not, and does not seek to constitute advice of any nature and no reliance should be placed upon statements contained herein. Whilst reasonable efforts have been taken to ensure that the contents of this publication are accurate and reliable, LSE Group does not guarantee that this document is free from errors or omissions; therefore, you may not rely upon the content of this document under any circumstances and you should seek your own independent legal, investment, tax and other advice. Neither We nor our affiliates shall be liable for any errors, inaccuracies or delays in the publication or any other content, or for any actions taken by you in reliance thereon.

Copyright © 2025 London Stock Exchange Group. All rights reserved.