ArXiv-Mcp-Server

3 months ago 4

A comprehensive Model Context Protocol (MCP) implementation for interacting with arXiv.org, enabling AI assistants to search, retrieve, analyze, and export academic papers seamlessly.

Paper Search: Advanced search with keywords, authors, titles, and categories
Paper Retrieval: Get detailed metadata for specific arXiv papers
Smart Summaries: Generate formatted paper summaries with key information

Author Search: Find all papers by specific researchers
Category Browsing: Explore papers in specific arXiv categories (cs.AI, cs.LG, etc.)
Recent Papers: Get latest publications with customizable time ranges
Paper Comparison: Side-by-side analysis of multiple papers
Related Papers: Discover papers related to a given work using intelligent matching
Citation Analysis: Estimated citation metrics and impact analysis
Trend Analysis: Research trend analysis with publication counts, top authors, and keyword frequency
Multi-format Export: Export papers in BibTeX, JSON, CSV, and Markdown formats

Using Poetry (Recommended)

# Clone the repository git clone https://github.com/yourusername/arxiv-mcp-client.git cd arxiv-mcp-client # Install dependencies poetry install # Activate virtual environment poetry shell

# Install dependencies pip install mcp httpx # Run the client python arxiv_mcp_client.py

import asyncio from arxiv_mcp_client import ArxivMCPClient async def main(): # Initialize client client = ArxivMCPClient() # Connect to MCP server await client.connect("arxiv_mcp_server.py") # Your code here... # Close connection await client.close() asyncio.run(main())

# General search results = await client.search_papers("transformer neural networks", max_results=10) # Search by author papers = await client.search_by_author("Geoffrey Hinton", max_results=20) # Search by category ai_papers = await client.search_by_category("cs.AI", max_results=15) # Get recent papers recent = await client.get_recent_papers("cs.LG", days_back=7, max_results=10)

# Get paper details paper = await client.get_paper_details("2301.07041") # Get formatted summary summary = await client.get_paper_summary("2301.07041") # Compare papers comparison = await client.compare_papers( ["2301.07041", "2302.13971"], comparison_fields=["authors", "abstract", "categories"] ) # Find related papers related = await client.find_related_papers("2301.07041", max_results=10)

# Export to BibTeX bibtex = await client.export_papers( ["2301.07041", "2302.13971"], format="bibtex", include_abstract=True ) # Analyze research trends trends = await client.analyze_trends( category="cs.AI", time_period="3_months", analysis_type="top_authors" )

Tool Description Key Parameters

search_arxiv	General paper search	query, max_results, sort_by
get_paper	Get specific paper details	arxiv_id
summarize_paper	Generate formatted summary	arxiv_id
search_by_author	Find papers by author	author_name, max_results
search_by_category	Browse category papers	category, max_results, sort_by
get_recent_papers	Get latest publications	category, days_back, max_results
compare_papers	Compare multiple papers	arxiv_ids, comparison_fields
find_related_papers	Discover related work	arxiv_id, max_results
get_paper_citations	Citation analysis	arxiv_id
analyze_trends	Research trend analysis	category, time_period, analysis_type
export_papers	Multi-format export	arxiv_ids, format, include_abstract

cs.AI - Artificial Intelligence
cs.LG - Machine Learning
cs.CV - Computer Vision
cs.CL - Computation and Language
cs.CR - Cryptography and Security
cs.DB - Databases
cs.DS - Data Structures and Algorithms

math.CO - Combinatorics
math.ST - Statistics Theory
math.PR - Probability
math.OC - Optimization and Control

physics.comp-ph - Computational Physics
physics.data-an - Data Analysis
quant-ph - Quantum Physics

See full category list

@article{2301_07041, title={Title of the Paper}, author={Author One and Author Two}, journal={arXiv preprint arXiv:2301.07041}, year={2023}, url={https://arxiv.org/abs/2301.07041} }

{ "id": "2301.07041", "title": "Title of the Paper", "authors": ["Author One", "Author Two"], "abstract": "Paper abstract...", "published": "2023-01-17", "categories": ["cs.AI", "cs.LG"] }

id,title,authors,published,categories,url 2301.07041,Title of the Paper,"Author One; Author Two",2023-01-17,"cs.AI; cs.LG",https://arxiv.org/abs/2301.07041

# Keyword search await client.search_papers("neural networks") # Multiple keywords await client.search_papers("transformer attention mechanism") # Exact phrase await client.search_papers('"large language models"')

# Author search await client.search_by_author("Yoshua Bengio") # Category search await client.search_by_category("cs.AI") # Recent papers in category await client.get_recent_papers("cs.LG", days_back=14)

Publication Count Analysis

Tracks the number of papers published over time in a specific category.

trends = await client.analyze_trends("cs.AI", "6_months", "publication_count") # Returns: monthly publication counts

Identifies the most prolific authors in a field over a given period.

trends = await client.analyze_trends("cs.LG", "1_year", "top_authors") # Returns: authors ranked by paper count

Keyword Frequency Analysis

Analyzes the most common keywords in paper titles within a category.

trends = await client.analyze_trends("cs.CV", "3_months", "keyword_frequency") # Returns: keywords ranked by frequency

The client includes comprehensive error handling:

try: results = await client.search_papers("machine learning") if "error" in results: print(f"Search failed: {results['error']}") else: print(f"Found {len(results['papers'])} papers") except Exception as e: print(f"Connection error: {e}")

The client respects arXiv's API guidelines with built-in rate limiting and timeout handling.

# Custom timeout client = ArxivMCPClient() client.http_client = httpx.AsyncClient(timeout=60.0) # Custom result limits results = await client.search_papers("AI", max_results=50) # Max: 100