Introduction
Keeping up with the relentless pace of Natural Language Processing research can feel like drinking from a firehose. So, what are the breakthroughs that truly matter? To find out, members of the ComplyAdvantage data science team recently attended the Association for Computational Linguistics 2025 conference (ACL 2025). We're back with a clear-eyed view of the landscape, and this post summarises the key trends poised to have the biggest impact on the industry and how we build intelligent systems.
Themes
The biggest themes of the conference were LLM Applications, Vector Embeddings, and Synthetic Data. Salient points were also raised on LLM Security and LLM Hallucinations.
LLM Security

The highlight of the conference was the discussion around LLM security. The consensus appears to be that the security of agentic systems is fairly poor across the board. “Red-team” security researchers reported remote-code-execution-style attacks with 50-100% success rates against example agentic systems, and zero-click approaches requiring no human interaction were also put forward. “Recent” models that are perceived as having strong guardrails were not immune: it appears that moving from a single model call to an agentic framework can significantly degrade the security of the overall application.
This echoes the concerns of some industry practitioners, some of whom flagged that they could circumvent every guardrail their production LLMs currently have in place. The consensus appears to be that model guardrails are not cast-iron defences against models carrying out inappropriate actions. The main recommendation was to deploy “non-differentiable” defences against LLM-focused attacks (e.g. regex filters or basic classifier models on user inputs).
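To make that recommendation concrete, below is a minimal sketch of a non-differentiable input filter. The blocklist patterns are hypothetical placeholders, and a production system would pair a filter like this with a dedicated classifier and other controls.

```python
import re

# Hypothetical blocklist patterns; illustrative only, not a complete defence.
BLOCKED_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"reveal your system prompt", re.IGNORECASE),
    re.compile(r"<\s*script\b", re.IGNORECASE),
]

def passes_input_guardrail(user_input: str) -> bool:
    """Return False if the input matches any known-bad pattern.

    A regex filter is "non-differentiable": an attacker cannot use
    gradient-based prompt optimisation against it, unlike the model's
    own learned guardrails.
    """
    return not any(p.search(user_input) for p in BLOCKED_PATTERNS)

print(passes_input_guardrail("Summarise this news article."))          # True
print(passes_input_guardrail("Ignore previous instructions and ..."))  # False
```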
In addition, it was flagged that model fine-tuning can significantly damage alignment, and that guardrails are much easier to subvert in low-resource languages.
In short: Practitioners have to be very careful about alignment & security in their LLM applications, as the baseline defences in place don't appear to be difficult to circumvent.
Synthetic Data

Another salient trend was the counterintuitive assertion that training on LLM-created or LLM-processed data can be more efficient than training on the raw data itself, with a keynote speech & numerous papers exploring the topic, spanning:
- Representing documents as LLM-created summaries
- Carrying out RAG on LLM-generated queries about each chunk
- Training on human-created data that had been rephrased by an LLM (various papers, including Saad et al. & Leesombatwathana et al.); a minimal sketch of this approach follows the list
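As a rough illustration of the rephrasing approach, here is a minimal sketch; `call_llm` is a hypothetical stand-in for whichever chat-completion client you use, and the prompt wording is ours, not taken from any of the papers.

```python
from typing import Callable

# Hypothetical prompt wording; the cited papers use their own templates.
REPHRASE_PROMPT = (
    "Rephrase the following passage in clear, simple English, "
    "preserving every fact:\n\n{passage}"
)

def build_synthetic_corpus(
    raw_documents: list[str],
    call_llm: Callable[[str], str],
) -> list[str]:
    """Return LLM-rephrased versions of the raw documents.

    The idea is that training on this rephrased corpus can be more
    sample-efficient than training on the raw text itself.
    """
    return [call_llm(REPHRASE_PROMPT.format(passage=doc)) for doc in raw_documents]
```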
In short: Synthetic data may be more useful than raw data for training ML systems.
LLM Applications

It should come as no surprise that research groups are throwing LLMs at every problem under the sun, with examples including bias detection, automated teaching & computer-use agents. LLMs are very much still a state-of-the-art solution for a wide range of academic problems.
Prompting best practice continues to be a discussion point, with the main takeaways from the industry roundtable being:
- Long task descriptions for single, simple tasks get the best results; chaining multiple tasks generally degrades performance
- The best prompts tend to be model-specific
- LLMs perform better when inputs and requested outputs are natural language rather than pseudo-JSON (see the sketch below)
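To illustrate those takeaways, here is a small, hypothetical example contrasting a chained, pseudo-JSON prompt with a long single-task prompt expressed in natural language:

```python
# Anti-pattern: several tasks chained into one call, with pseudo-JSON output.
CHAINED_PROMPT = (
    "1) Extract entities as JSON. 2) Classify sentiment. 3) Summarise. "
    'Return {"entities": [...], "sentiment": ..., "summary": ...}'
)

# Preferred: a long, explicit description of a single task, with the
# input and the requested output both in natural language.
SINGLE_TASK_PROMPT = """You are reviewing a news article about a company.
Your only task is to decide whether the article describes financial crime.

Treat fraud, money laundering, sanctions breaches and bribery as financial
crime. First quote the sentence that best supports your decision, then
answer with exactly one of: "describes financial crime" or "does not
describe financial crime".

Article:
{article}
"""

def build_prompt(article: str) -> str:
    """Fill the single-task template with the article text."""
    return SINGLE_TASK_PROMPT.format(article=article)
```

Splitting the chained version into separate single-task calls, each with its own detailed prompt, is the pattern the roundtable advice points towards.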
In short: LLMs continue to be state-of-the-art solutions for many academic problems.
LLM Hallucinations

There is active research into the phenomenon of LLM hallucinations, spanning from how to discourage them in general to predicting whether specific outputs are hallucinations. Techniques like RAUQ are asserted to work well at predicting hallucinations in white-box models. The latest generation of LLMs have some capacity to verbalise how confident they are in an answer, but this underperforms other approaches.
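As a rough illustration of why white-box access matters, here is a minimal uncertainty baseline over generated-token log-probabilities. This is a far simpler signal than RAUQ itself, and the threshold is a hypothetical value that would need tuning per model and task.

```python
def mean_token_logprob(token_logprobs: list[float]) -> float:
    """Average log-probability the model assigned to its own output tokens.

    A crude white-box signal: low values mean the model was unsure,
    which correlates (imperfectly) with hallucination.
    """
    return sum(token_logprobs) / len(token_logprobs)

def is_likely_hallucination(token_logprobs: list[float],
                            threshold: float = -1.5) -> bool:
    # `threshold` is hypothetical and must be tuned per model and task.
    return mean_token_logprob(token_logprobs) < threshold

confident = [-0.05, -0.10, -0.02, -0.20]   # high-probability tokens
unsure = [-2.3, -1.8, -3.1, -2.6]          # low-probability tokens
print(is_likely_hallucination(confident))  # False
print(is_likely_hallucination(unsure))     # True
```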
In short: LLM hallucination prediction techniques exist, but many come with significant compute cost or require white-box models.
Vector Embeddings

Embeddings remain a very active area of research, with applications spanning RAG systems, event resolution, entity search & LLM evaluation. Interesting work is being carried out on using multilingual embeddings in areas adjacent to entity resolution. When we discussed how ComplyAdvantage extracts and manages adverse media data, researchers were keen to understand how industry tackles the challenges and methodologies in this domain. This engagement reinforced the importance of mutual learning and open dialogue between industry and academia to advance the practical application of data science in this critical space.
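For readers unfamiliar with the approach, here is a minimal sketch of using multilingual embeddings to match entity-name variants across languages; the model choice and example names are illustrative, not a description of our production pipeline.

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative model choice; any multilingual embedding model would do.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# The same organisation referred to in English, Russian and Spanish.
names = [
    "United Nations",
    "Организация Объединённых Наций",
    "Naciones Unidas",
]
embeddings = model.encode(names, normalize_embeddings=True)

# Cosine similarity between the English name and the other variants;
# thresholding this score is one simple entity-resolution signal.
print(util.cos_sim(embeddings[0], embeddings[1:]))
```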
In short: Embedding-based approaches have demonstrated value on many academic problems, including Entity Resolution.
Conclusions
The field is moving from demonstrating that LLMs can solve problems to more mature discussions around the efficiency and safety of LLM-based solutions. The two most impactful trends are the surprising utility of synthetic data for training, and the critical lack of robust LLM security. As we leverage these powerful models, we must keep a close eye on the security & alignment of our applications.
Appendix: Highlighted Papers for Industry Researchers
LLM Mechanics:
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
- Reconsidering LLM Uncertainty Estimation Methods in the Wild
Embeddings:
- Hierarchical Level-Wise News Article Clustering via Multilingual Matryoshka Embeddings
- Wikivecs: A Fully Reproducible Vectorization of Multilingual Wikipedia
- LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking
- Enhancing Event-centric News Cluster Summarization via Data Sharpening and Localization Insights
LLM Applications:
- A Survey of Context Engineering for Large Language Models
- Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence
- A Joint Optimization Framework for Enhancing Efficiency of Tool Utilization in LLM Agents
- RIOT: Efficient Prompt Refinement with Residual Optimization Tree
