Allganize Launches Finance LLM Leaderboard and Test Dataset for Model Evaluation

Allganize introduces a Finance LLM Leaderboard and test dataset for precise evaluation of finance-specialized language models, enhancing productivity and accuracy in financial tasks.

Allganize has released a finance LLM leaderboard and test dataset so that financial enterprises can efficiently evaluate which LLMs suit their work. The Finance LLM Leaderboard evaluates how well LLMs handle financial terminology and complex reasoning, and the test dataset focuses on understanding financial documents, calculating formulas, and analyzing tables and charts. Financial companies are expected to be able to evaluate performance efficiently when adopting an LLM.

- Release of a finance-specialized LLM leaderboard and a test dataset built on know-how from financial projects. Users can compare LLM-generated answers and check rankings on a transparent platform.

- Launch of 'Alpha-F (Finance)', a finance-specialized LLM trained on 1 million financial data points. Allganize plans to expand to 'Alpha-M' (manufacturing), 'Alpha-G' (government), and others in the future.

- Financial companies will be able to select LLMs that understand financial terminology better than general-purpose models and improve productivity by adopting AI.

Allganize, an all-in-one LLM and AI solution company, has released a finance LLM leaderboard. The LLM Leaderboard is a platform that measures, ranks, and evaluates the performance of artificial intelligence language models. Anyone can register their own LLM and compete with other models.

Allganize's Finance LLM Leaderboard evaluates the performance of LLMs specializing in complex reasoning and in understanding financial terms and abbreviations. Although general-purpose LLMs are convenient for everyday use, they are not specialized for the complex reasoning required in finance, including formula calculations and exception conditions. General-purpose LLMs are also weak at understanding the tables and charts that financial documents rely on to emphasize figures and trends.

Allganize’s finance LLM leaderboard lets practitioners immediately compare LLMs that are suitable for their financial documents and work style.

Allganize has also released all of its self-produced test datasets so that financial institutions can compare and evaluate the performance of finance-specific LLMs. Currently, 13 general-purpose and finance-specialized LLMs, including OpenAI's GPT-4, Claude-3, and Gemini, are competing. Three of these are Allganize's own models, fine-tuned on specialized financial data; they appear on the leaderboard with names starting with Alpha-F.

Users can also directly compare LLM-generated answers to finance-related questions in the Finance LLM Arena. Two randomly selected anonymous LLMs generate answers, and when the user selects the better answer, the model's name is revealed. Answer preferences are reflected in real time, so you can immediately check the rankings between models. As of June 6, the ranking is Claude 3, GPT-4, Alpha-F (EEVE), and Alpha-F (OpenSolarKO).
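The article does not say how the Arena turns individual preferences into a ranking. As a rough illustration only, the sketch below uses an Elo-style rating update, a common choice for pairwise comparisons; the starting rating, the step size `K`, the function names, and the placeholder model names are all assumptions and do not describe Allganize's actual method.

```python
from collections import defaultdict

K = 32            # assumed step size for each rating update
START = 1000.0    # assumed starting rating for every model


def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def record_vote(ratings, winner, loser, k=K):
    """Update both ratings after a user prefers `winner` over `loser`."""
    surprise = 1.0 - expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * surprise
    ratings[loser] -= k * surprise


def ranking(ratings):
    """Return model names ordered from highest to lowest rating."""
    return sorted(ratings, key=ratings.get, reverse=True)


# Hypothetical votes: (preferred model, other model).
ratings = defaultdict(lambda: START)
votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
]
for winner, loser in votes:
    record_vote(ratings, winner, loser)

print(ranking(ratings))
```

Because each vote updates only the two models involved, the ranking can be recomputed immediately after every preference, which matches the real-time behavior described above.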

Real-time ranking of answer preferences on the Allganize Finance LLM Leaderboard

Allganize released its full test dataset because financial customers who want to adopt an LLM find it difficult to properly evaluate which language model suits them.

The test dataset currently produced and released by Allganize is RAG (Retrieval Augmented Generation) data built from economic research reports, financial reports, financial glossaries of public institutions, and other financial documents. It covers frequently appearing formulas, complex tables, and chart-specific data. The data was drawn from sources in multiple languages, including English and Korean, and Allganize also created its own data containing complex finance-related formulas and tables.

Allganize recently changed the official name of Ali Finance LLM, its finance-specialized AI language model, to ‘Alpha-F (Alpha-Finance)’. Alpha-F is strong at understanding complex financial terms and abbreviations because it was trained on 1 million pieces of data specialized for the finance industry, and it also includes 200,000 pieces of RAG (Retrieval Augmented Generation) data. In the future, Allganize plans to release industry-specific LLMs such as 'Alpha-M (Manufacturing)' and 'Alpha-G (Government)'.

Allganize also introduced the ‘Finance LLM App Market’, which specializes in automating financial tasks. By using an LLM app built on AI cognitive search solutions, you can quickly find financial information that is otherwise difficult to understand. For example, you can create a corporate LLM app that answers user questions, such as searching for bank dispute cases, based on a financial company's documents, manuals, and latest information. You can choose the app you need from the LLM apps registered on the app market and use it right away in your work, or you can select the LLM you want and create your own AI-enabled apps tailored to your specific business and workflows.

Changsu Lee, CEO of Allganize, said, "When carrying out projects with finance companies, many customers wanted an objective LLM performance evaluation. Performance evaluation requires finance-specific test data, but it takes companies a lot of time and money to create such datasets. We want to help companies adopt a competitive LLM efficiently by disclosing all of the data containing Allganize's know-how, and to let them experience the productivity gains that come from an LLM that understands financial terms better than a general-purpose model."

If you're curious about how to optimize your company's finance operations, contact Allganize!

Learn more about LLM apps for businesses you can start using today.