paper-system/20250101-20250105-cat-cs.AI...

2025-01-24 15:26:47 +00:00
[{"title":"Beyond Text: Implementing Multimodal Large Language Model-Powered\n Multi-Agent Systems Using a No-Code Platform","abstract":" This study proposes the design and implementation of a multimodal LLM-based\nMulti-Agent System (MAS) leveraging a No-Code platform to address the practical\nconstraints and significant entry barriers associated with AI adoption in\nenterprises. Advanced AI technologies, such as Large Language Models (LLMs),\noften pose challenges due to their technical complexity and high implementation\ncosts, making them difficult for many organizations to adopt. To overcome these\nlimitations, this research develops a No-Code-based Multi-Agent System designed\nto enable users without programming knowledge to easily build and manage AI\nsystems. The study examines various use cases to validate the applicability of\nAI in business processes, including code generation from image-based notes,\nAdvanced RAG-based question-answering systems, text-based image generation, and\nvideo generation using images and prompts. These systems lower the barriers to\nAI adoption, empowering not only professional developers but also general users\nto harness AI for significantly improved productivity and efficiency. By\ndemonstrating the scalability and accessibility of No-Code platforms, this\nstudy advances the democratization of AI technologies within enterprises and\nvalidates the practical applicability of Multi-Agent Systems, ultimately\ncontributing to the widespread adoption of AI across various industries.\n","arxiv_id":"http://arxiv.org/abs/2501.00750v1","authors":["Cheonsu Jeong"]},{"title":"A3: Android Agent Arena for Mobile GUI Agents","abstract":" AI agents have become increasingly prevalent in recent years, driven by\nsignificant advancements in the field of large language models (LLMs). Mobile\nGUI agents, a subset of AI agents, are designed to autonomously perform tasks\non mobile devices. 
While numerous studies have introduced agents, datasets, and\nbenchmarks to advance mobile GUI agent research, many existing datasets focus\non static frame evaluations and fail to provide a comprehensive platform for\nassessing performance on real-world, in-the-wild tasks. To address this gap, we\npresent Android Agent Arena (A3), a novel evaluation platform. Unlike existing\nin-the-wild systems, A3 offers: (1) meaningful and practical tasks, such as\nreal-time online information retrieval and operational instructions; (2) a\nlarger, more flexible action space, enabling compatibility with agents trained\non any dataset; and (3) automated business-level LLM-based evaluation process.\nA3 includes 21 widely used general third-party apps and 201 tasks\nrepresentative of common user scenarios, providing a robust foundation for\nevaluating mobile GUI agents in real-world situations and a new autonomous\nevaluation process for less human labor and coding expertise. The project is\navailable at \\url{https://yuxiangchai.github.io/Android-Agent-Arena/}.\n","arxiv_id":"http://arxiv.org/abs/2501.01149v1","authors":["Yuxiang Chai","Hanhao Li","Jiayu Zhang","Liang Liu","Guozhi Wang","Shuai Ren","Siyuan Huang","Hongsheng Li"]},{"title":"Rethinking Relation Extraction: Beyond Shortcuts to Generalization with\n a Debiased Benchmark","abstract":" Benchmarks are crucial for evaluating machine learning algorithm performance,\nfacilitating comparison and identifying superior solutions. However, biases\nwithin datasets can lead models to learn shortcut patterns, resulting in\ninaccurate assessments and hindering real-world applicability. This paper\naddresses the issue of entity bias in relation extraction tasks, where models\ntend to rely on entity mentions rather than context. We propose a debiased\nrelation extraction benchmark DREB that breaks the pseudo-correlation between\nentity mentions and relation types through entity replacement. 
DREB utilizes\nBias Evaluator and PPL Evaluator to ensure low bias and high naturalness,\nproviding a reliable and accurate assessment of model generalization in entity\nbias scenarios. To esta