AITopics | counterclockwise

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you have only a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Understanding Camera Motions in Any Video

Neural Information Processing Systems | Jun-22-2026, 16:28:17 GMT

We introduce CameraBench, a large-scale dataset and benchmark designed to assess and improve camera motion understanding. CameraBench consists of 3,000 diverse internet videos, annotated by experts through a rigorous multi-stage quality control process. One of our core contributions is a taxonomy or "language" of camera motion primitives, designed in collaboration with cinematographers. We find, for example, that some primitives like "follow" (or tracking) require understanding scene content like moving subjects. We conduct a large-scale human study to quantify human annotation performance, revealing that domain expertise and tutorial-based training can significantly enhance accuracy. For example, a novice may confuse zoom-in (a change of intrinsics) with translating forward (a change of extrinsics), but can be trained to differentiate the two. Using CameraBench, we evaluate Structure-from-Motion (SfM) and Video-Language Models (VLMs), finding that SfM models struggle to capture semantic primitives that depend on scene content, while VLMs struggle to capture geometric primitives that require precise estimation of trajectories. We then fine-tune a generative VLM on CameraBench to achieve the best of both worlds and showcase its applications, including motion-augmented captioning, video question answering, and video-text retrieval. We hope our taxonomy, benchmark, and tutorials will drive future efforts towards the ultimate goal of understanding camera motions in any video.
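The zoom-versus-translation distinction mentioned in this abstract can be made concrete with a pinhole camera model. The sketch below is illustrative only and is not taken from CameraBench: the names `project` and `size_ratio`, the sample points, and the focal lengths are all hypothetical. It suggests why the two motions differ: changing the focal length (intrinsics) scales every image coordinate by the same factor, whereas moving the camera forward (extrinsics) magnifies near points more than far ones.

```python
def project(point, focal_length, camera_z=0.0):
    """Pinhole projection of a 3D point for a camera at (0, 0, camera_z) looking along +z.

    All names and values here are hypothetical, chosen only for illustration.
    """
    x, y, z = point
    depth = z - camera_z
    if depth <= 0:
        raise ValueError("point is behind the camera")
    return (focal_length * x / depth, focal_length * y / depth)


NEAR = (1.0, 0.0, 4.0)  # a point 4 units in front of the starting camera
FAR = (1.0, 0.0, 8.0)   # a point 8 units in front of the starting camera


def size_ratio(focal_length, camera_z):
    """Ratio of the near point's image x-coordinate to the far point's."""
    u_near, _ = project(NEAR, focal_length, camera_z)
    u_far, _ = project(FAR, focal_length, camera_z)
    return u_near / u_far


baseline = size_ratio(focal_length=1.0, camera_z=0.0)
zoomed = size_ratio(focal_length=2.0, camera_z=0.0)   # zoom-in: intrinsics change
dollied = size_ratio(focal_length=1.0, camera_z=2.0)  # move forward: extrinsics change

# Zooming leaves the near/far ratio unchanged; moving forward increases it.
print(baseline, zoomed, dollied)
```

Under these assumptions, the near/far ratio is the cue that separates the two motions: it stays fixed under zoom and grows as the camera translates forward.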

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Overview (0.92)
Instructional Material (0.67)

Industry:

Media > Photography (1.00)
Media > Film (1.00)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Strategyproof Facility Location for Five Agents on a Circle using PCD

Farjoun, Ido, Meir, Reshef

arXiv.org Artificial Intelligence | Oct-21-2025

We consider the strategyproof facility location problem on a circle. We focus on the case of 5 agents, and find a tight bound for the PCD strategyproof mechanism, which selects the reported location of an agent in proportion to the length of the arc in front of it. We methodically "reduce" the size of the instance space and then use standard optimization techniques to find and prove the bound is tight. Moreover, we hypothesize the approximation ratio of PCD for general odd $n$.
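A rough sketch of how a mechanism of this kind might assign selection probabilities is given below. It is not the authors' implementation. It assumes a circle of circumference 1, an odd number of agents, and one particular reading of "the arc in front of it": the arc between the two agents that sit opposite the given agent in circular order. The function name `pcd_probabilities` and the uniform fallback for coinciding reports are hypothetical.

```python
def pcd_probabilities(reports):
    """Selection probability for each agent, proportional to its opposite arc.

    Assumptions (not from the abstract): positions lie in [0, 1) on a circle of
    circumference 1, the number of agents is odd, and the arc "in front of" an
    agent is the arc between the two agents opposite it in circular order.
    """
    n = len(reports)
    if n < 3 or n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of agents, at least 3")

    # Agent indices sorted by position around the circle.
    order = sorted(range(n), key=lambda i: reports[i])

    # arcs[k] is the arc from the k-th sorted agent to the next one, wrapping around.
    arcs = [
        (reports[order[(k + 1) % n]] - reports[order[k]]) % 1.0
        for k in range(n)
    ]
    total = sum(arcs)
    if total == 0:
        # All reports coincide; fall back to a uniform choice (hypothetical tie rule).
        return [1.0 / n] * n

    half = (n - 1) // 2
    probabilities = [0.0] * n
    for k, agent in enumerate(order):
        # The arc opposite the agent at rank k starts `half` ranks further along.
        probabilities[agent] = arcs[(k + half) % n] / total
    return probabilities


three_agents = pcd_probabilities([0.0, 0.25, 0.5])
five_agents = pcd_probabilities([0.0, 0.1, 0.2, 0.5, 0.7])
```

With an odd number of agents, each arc between neighbouring reports is opposite exactly one agent, so the probabilities sum to 1. A facility location could then be drawn with `random.choices(reports, weights=probabilities)`.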

agent, artificial intelligence, optimization problem, (17 more...)

arXiv.org Artificial Intelligence

2510.17435

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)

Towards Understanding Camera Motions in Any Video

Lin, Zhiqiu, Cen, Siyuan, Jiang, Daniel, Karhade, Jay, Wang, Hewei, Mitra, Chancharik, Ling, Tiffany, Huang, Yuhan, Liu, Sifan, Chen, Mingyu, Zawar, Rushikesh, Bai, Xue, Du, Yilun, Gan, Chuang, Ramanan, Deva

arXiv.org Artificial Intelligence | Sep-1-2025

We introduce CameraBench, a large-scale dataset and benchmark designed to assess and improve camera motion understanding. CameraBench consists of ~3,000 diverse internet videos, annotated by experts through a rigorous multi-stage quality control process. One of our contributions is a taxonomy of camera motion primitives, designed in collaboration with cinematographers. We find, for example, that some motions like "follow" (or tracking) require understanding scene content like moving subjects. We conduct a large-scale human study to quantify human annotation performance, revealing that domain expertise and tutorial-based training can significantly enhance accuracy. For example, a novice may confuse zoom-in (a change of intrinsics) with translating forward (a change of extrinsics), but can be trained to differentiate the two. Using CameraBench, we evaluate Structure-from-Motion (SfM) and Video-Language Models (VLMs), finding that SfM models struggle to capture semantic primitives that depend on scene content, while VLMs struggle to capture geometric primitives that require precise estimation of trajectories. We then fine-tune a generative VLM on CameraBench to achieve the best of both worlds and showcase its applications, including motion-augmented captioning, video question answering, and video-text retrieval. We hope our taxonomy, benchmark, and tutorials will drive future efforts towards the ultimate goal of understanding camera motions in any video.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2504.15376

Genre: Research Report (1.00)

Industry:

Media > Photography (1.00)
Media > Film (1.00)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Impact-resistant, autonomous robots inspired by tensegrity architecture

Johnson, William R. III, Huang, Xiaonan, Lu, Shiyang, Wang, Kun, Booth, Joran W., Bekris, Kostas, Kramer-Bottiglio, Rebecca

arXiv.org Artificial Intelligence | Jan-25-2025

Future robots will navigate perilous, remote environments with resilience and autonomy. Researchers have proposed building robots with compliant bodies to enhance robustness, but this approach often sacrifices the autonomous capabilities expected of rigid robots. Inspired by tensegrity architecture, we introduce a tensegrity robot -- a hybrid robot made from rigid struts and elastic tendons -- that demonstrates the advantages of compliance and the autonomy necessary for task performance. This robot boasts impact resistance and autonomy in a field environment and additional advances in the state of the art, including surviving harsh impacts from drops (at least 5.7 m), accurately reconstructing its shape and orientation using on-board sensors, achieving high locomotion speeds (18 bar lengths per minute), and climbing the steepest incline of any tensegrity robot (28 degrees). We characterize the robot's locomotion on unstructured terrain, showcase its autonomous capabilities in navigation tasks, and demonstrate its robustness by rolling it off a cliff.

artificial intelligence, gait, robot, (16 more...)

arXiv.org Artificial Intelligence

2501.15078

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)

Genre: Research Report (1.00)

Industry: Materials > Chemicals (0.46)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Autoregressive Large Language Models are Computationally Universal

Schuurmans, Dale, Dai, Hanjun, Zanini, Francesco

arXiv.org Artificial Intelligence | Oct-4-2024

We show that autoregressive decoding of a transformer-based language model can realize universal computation, without external intervention or modification of the model's weights. Establishing this result requires understanding how a language model can process arbitrarily long inputs using a bounded context. For this purpose, we consider a generalization of autoregressive decoding where, given a long input, emitted tokens are appended to the end of the sequence as the context window advances. We first show that the resulting system corresponds to a classical model of computation, a Lag system, that has long been known to be computationally universal. By leveraging a new proof, we show that a universal Turing machine can be simulated by a Lag system with 2027 production rules. We then investigate whether an existing large language model can simulate the behaviour of such a universal Lag system. We give an affirmative answer by showing that a single system-prompt can be developed for gemini-1.5-pro-001 that drives the model, under deterministic (greedy) decoding, to correctly apply each of the 2027 production rules. We conclude that, by the Church-Turing thesis, prompted gemini-1.5-pro-001 with extended autoregressive (greedy) decoding is a general purpose computer.
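The decoding scheme described in this abstract, where a bounded window reads the front of a sequence while emitted symbols are appended to the end, can be sketched as a small queue-rewriting loop. The example below is a toy and not the paper's 2027-rule universal system. The function name `run_lag_system`, the window of two symbols, the deletion of one symbol per step, the halting rule, and the three-entry rule table are all assumptions made for illustration.

```python
from collections import deque
from itertools import islice


def run_lag_system(rules, tape, window=2, max_steps=1000):
    """Run a toy lag-style rewriting system and return the final sequence.

    Each step reads the first `window` symbols, looks up a production rule,
    removes only the first symbol, and appends the rule's output at the end.
    The loop halts when no rule matches or too few symbols remain.
    The conventions here are assumptions, not details from the abstract.
    """
    queue = deque(tape)
    for _ in range(max_steps):
        if len(queue) < window:
            break
        context = tuple(islice(queue, window))
        emitted = rules.get(context)
        if emitted is None:
            break  # no production rule applies: halt
        queue.popleft()        # the window advances by one symbol
        queue.extend(emitted)  # emitted symbols go to the end of the sequence
    return list(queue)


# Hypothetical rule table over the alphabet {a, b}.
toy_rules = {
    ("a", "b"): ("b",),
    ("b", "b"): ("a",),
    ("b", "a"): (),
}

# Trace: abb -> bbb -> bba -> baa -> aa, then no rule matches ("a", "a").
result = run_lag_system(toy_rules, "abb")
print(result)  # ['a', 'a']
```

In the setting the abstract describes, the lookup `rules.get(context)` would be replaced by a language model call under greedy decoding; here a dictionary stands in for it.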

iteration, lag system, language model, (17 more...)

arXiv.org Artificial Intelligence

2410.0317

Country:

North America > Canada > Alberta (0.14)
Europe > Moldova (0.04)
Europe > Ireland (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Probing Mechanical Reasoning in Large Vision Language Models

Sun, Haoran, Gao, Qingying, Lyu, Haiyun, Luo, Dezhi, Deng, Hokin, Li, Yijiang

arXiv.org Artificial Intelligence | Sep-30-2024

Mechanical reasoning is a fundamental ability that sets human intelligence apart from other animal intelligence. Mechanical reasoning allows us to design tools, build bridges and canals, and construct houses which set the foundation of human civilization. Embedding machines with such ability is an important step towards building human-level artificial intelligence. Recently, Li et al. built CogDevelop2K, a data-intensive cognitive experiment benchmark for assaying the developmental trajectory of machine intelligence (Li et al., 2024). Here, to investigate mechanical reasoning in Vision Language Models, we leverage the MechBench of CogDevelop2K, which contains approximately 150 cognitive experiments, to test understanding of mechanical system stability, gears and pulley systems, seesaw-like systems and leverage principle, inertia and motion, and other fluid-related systems in Large Vision Language Models. We observe diverse yet consistent behaviors over these aspects in VLMs.

correct answer, pulley system, reasoning, (13 more...)

arXiv.org Artificial Intelligence

2410.00318

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > North Carolina (0.04)
North America > United States > Michigan (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

A Motion Planning Algorithm in a Figure Eight Track

Jardon, Cristian, Sheppard, Brian, Zaveri, Veet

arXiv.org Artificial Intelligence | Mar-17-2024

We design a motion planning algorithm to coordinate the movements of two robots along a figure eight track, in such a way that no collisions occur. We use a topological approach to robot motion planning that relates instabilities in motion planning algorithms to topological features of configuration spaces. The topological complexity of a configuration space is an invariant that measures the complexity of motion planning algorithms. We show that the topological complexity of our problem is 3 and construct an explicit algorithm with three continuous instructions.

algorithm, configuration space, robot, (16 more...)

arXiv.org Artificial Intelligence

2403.0557

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)

Universal Syntactic Structures: Modeling Syntax for Various Natural Languages

Kim, Min K., Takero, Hafu, Fedovik, Sara

arXiv.org Artificial Intelligence | Dec-28-2023

We aim to provide an explanation for how the human brain might connect words for sentence formation. A novel approach to modeling syntactic representation is introduced, potentially showing the existence of universal syntactic structures for all natural languages. As the discovery of DNA's double helix structure shed light on the inner workings of genetics, we wish to introduce a basic understanding of how language might work in the human brain. It could be the brain's way of encoding and decoding knowledge. It also brings some insight into theories in linguistics, psychology, and cognitive science. After looking into the logic behind universal syntactic structures and the methodology of the modeling technique, we attempt to analyze corpora that showcase universality in the language process of different natural languages such as English and Korean. Lastly, we discuss the critical period hypothesis, universal grammar, and a few other assertions on language for the purpose of advancing our understanding of the human brain.

syntactic structure, translation, word order, (14 more...)

arXiv.org Artificial Intelligence

2402.01641

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Malden (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Transportation (0.96)
Health & Medicine > Therapeutic Area (0.46)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Classification of Orbits in Poincaré Maps using Machine Learning

Kamath, Chandrika

arXiv.org Artificial Intelligence | May-17-2023

The quest for low-cost fusion power has led to the construction of experimental devices such as the DIII-D [8], an operational device for conducting magnetic fusion research, and ITER [16], an international project to help make the transition from studies of plasma physics to electricity-generating fusion power plants. These devices, called tokamaks, use magnetic fields to confine the fusion fuel in the form of a plasma, enabling physicists to perform experiments to determine the best shape for the hot reacting plasma and the magnetic fields necessary to hold it in place. To complement the experiments, computer simulations are used to gain an understanding of the complex physics of the plasmas, design new reactors, and select the parameters to be used in experiments. Data from both the experiments and the simulations are analyzed to provide the insights that will contribute to achieving the goal of fusion power. In this paper, we focus on a specific analysis problem that arises in both simulation and experimental data, namely, the classification of orbits in a Poincaré map, also called a Poincaré plot. These two-dimensional plots are obtained for planes, called poloidal planes, which intersect the torus-shaped tokamak perpendicular to the magnetic axis, as shown in Figure 1(a). A plot consists of several orbits, each composed of a number of points (Figure 1(b)). For a given orbit, these points are the intersections of a field line (the solid lines in Figure 1(a)) with a poloidal plane, as the field line is followed around the torus. There are four distinct shapes traced out by these points, leading to four classes of orbits: quasi-periodic, separatrix, island chain, and stochastic, as shown in Figure 2. In some cases, the orbit shows its distinctive shape with just a few points, corresponding to the first few intersections of the field line with the poloidal plane.

artificial intelligence, machine learning, orbit, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s41060-022-00368-3

2305.13329

Country:

North America > United States > Massachusetts > Norfolk County > Wellesley (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > United States > California > Alameda County > Livermore (0.04)

Genre: Research Report (0.70)

Industry:

Energy > Power Industry (0.54)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.47)
