A library of artificial intelligence-focused podcasts, videos, books, blogs, newsletters, and articles from newspapers, magazines, and scholarly journals. Covers more than four dozen areas affected by LLMs and generative AI.
Email: angela.cox@uni.edu
If you need to have a meeting with me outside the hours available, send me an email request and suggest other possible dates and times.
Released on January 26, 2025
Why Claude Sonnet 3.5 from Anthropic, Gemini from Google, and ChatGPT from OpenAI are still the best three options for most people.
Chatbot from Microsoft
Powered by GPT-4 and features the DALL-E 3 image generator.
To access Microsoft Copilot:
1. Navigate to https://copilot.uni.edu
2. Use CatID@uni.edu as your username. Use your CatID password on the next page to finish your login.
3. Once logged in, you will see a shield in the upper-right corner.
4. If you hover your mouse over the shield, it will show that Enterprise Data Protection is enabled.
5. You are now able to interact with Copilot with Enterprise Data Protection.
"We present Chatbot Arena, a benchmark platform for large language models (LLMs) that features anonymous, randomized battles in a crowdsourced manner. In this blog post, we are releasing our initial results and a leaderboard based on the Elo rating system, which is a widely-used rating system in chess and other competitive games. We invite the entire community to join this effort by contributing new models and evaluating them by asking questions and voting for your favorite answer."
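For readers unfamiliar with the Elo rating system mentioned in the quote, the following is a minimal sketch of the standard Elo update after a single head-to-head comparison. It is not Chatbot Arena's actual code; the function names, the starting rating of 1000, and the K-factor of 32 are assumptions chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one battle.

    score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a tie.
    k (the K-factor) is an illustrative assumption, not a published setting.
    """
    expected_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b


# Two models start with equal ratings; a voter prefers model A's answer.
print(update_elo(1000, 1000, 1.0))
```

Because the two models start with equal ratings, each is expected to win half the time, so the winner gains exactly as many points as the loser gives up.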
Massive Multitask Language Understanding (MMLU) (Hendrycks et al., 2020) is a multiple-choice question answering test that covers 57 tasks including elementary mathematics, US history, computer science, law, and more. We publish results from evaluating various models on MMLU using HELM. Our evaluation results include the following:
Simple, standardized prompts
Accuracy breakdown for each of the 57 subjects
Full transparency of all raw prompts and predictions
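To make the idea of a per-subject accuracy breakdown concrete, here is a small sketch of how such a breakdown could be computed from multiple-choice predictions. It is not HELM's implementation; the record layout (subject, predicted letter, correct letter) and the sample data are hypothetical.

```python
from collections import defaultdict


def accuracy_by_subject(records):
    """Compute the fraction of correct answers for each subject.

    records: iterable of (subject, predicted_choice, correct_choice) tuples.
    This layout is a hypothetical one chosen for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, predicted, gold in records:
        total[subject] += 1
        if predicted == gold:
            correct[subject] += 1
    return {subject: correct[subject] / total[subject] for subject in total}


# Made-up sample predictions, not real benchmark results.
sample = [
    ("law", "A", "A"),
    ("law", "B", "C"),
    ("us_history", "D", "D"),
]
print(accuracy_by_subject(sample))
```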
As of September 2024, several prominent LLM leaderboards are actively monitoring and evaluating the performance of large language models across a diverse range of benchmarks and tasks. These leaderboards provide invaluable insights into how different models stack up against each other, helping researchers and practitioners understand the strengths and weaknesses of various approaches. Here are some of the top LLM leaderboards you should know about.
"The largest database of 14678 AIs available for over 14337 tasks" ... (as of 8/18/2024). "Use our smart AI search to find the best and latest AI tools for any use case." ... There's an AI for That
"You.com's platform empowers users, regardless of their technical expertise, by allowing them to select from a wide array of LLMs and tailor their assistant's behavior through custom instructions." --- Michael Nuñez, VentureBeat, May 30, 2024
"Connected Papers is a unique, visual tool to help researchers and applied scientists find and explore papers relevant to their field of work. To create each graph, we analyze an order of ~50,000 papers and select the few dozen with the strongest connections to the origin paper."
"Semantic Scholar provides free, AI-driven search and discovery tools, and open resources for the global research community. We index over 200 million academic papers sourced from publisher partnerships, data providers, and web crawls."