AITopics | Kiyavash, Negar

Collaborating Authors

Kiyavash, Negar

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On Identifiability of Conditional Causal Effects

Kivva, Yaroslav, Etesami, Jalal, Kiyavash, Negar

arXiv.org Artificial IntelligenceJun-19-2023

We address the problem of identifiability of an arbitrary conditional causal effect given both the causal graph and a set of any observational and/or interventional distributions of the form $Q[S]:=P(S|do(V\setminus S))$, where $V$ denotes the set of all observed variables and $S\subseteq V$. We call this problem conditional generalized identifiability (c-gID in short) and prove the completeness of Pearl's $do$-calculus for the c-gID problem by providing sound and complete algorithm for the c-gID problem. This work revisited the c-gID problem in Lee et al. [2020], Correa et al. [2021] by adding explicitly the positivity assumption which is crucial for identifiability. It extends the results of [Lee et al., 2019, Kivva et al., 2022] on general identifiability (gID) which studied the problem for unconditional causal effects and Shpitser and Pearl [2006b] on identifiability of conditional causal effects given merely the observational distribution $P(\mathbf{V})$ as our algorithm generalizes the algorithms proposed in [Kivva et al., 2022] and [Shpitser and Pearl, 2006b].

artificial intelligence, machine learning, realization, (17 more...)

arXiv.org Artificial Intelligence

2306.11755

Country:

Europe (0.46)
North America > United States (0.27)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning (0.67)

Add feedback

Causal Imitability Under Context-Specific Independence Relations

Jamshidi, Fateme, Akbari, Sina, Kiyavash, Negar

arXiv.org Artificial IntelligenceJun-11-2023

Drawbacks of ignoring the causal mechanisms when performing imitation learning have recently been acknowledged. Several approaches both to assess the feasibility of imitation and to circumvent causal confounding and causal misspecifications have been proposed in the literature. However, the potential benefits of the incorporation of additional information about the underlying causal structure are left unexplored. An example of such overlooked information is context-specific independence (CSI), i.e., independence that holds only in certain contexts. We consider the problem of causal imitation learning when CSI relations are known. We prove that the decision problem pertaining to the feasibility of imitation in this setting is NP-hard. Further, we provide a necessary graphical criterion for imitation learning under CSI and show that under a structural assumption, this criterion is also sufficient. Finally, we propose a sound algorithmic approach for causal imitation learning which takes both CSI relations and data into account.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2306.00585

Country:

Europe > Switzerland (0.28)
North America > United States > California (0.28)

Genre: Research Report (0.50)

Industry: Automobiles & Trucks (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Add feedback

Causal Bandits without Graph Learning

Konobeev, Mikhail, Etesami, Jalal, Kiyavash, Negar

arXiv.org Artificial IntelligenceJun-8-2023

We study the causal bandit problem when the causal graph is unknown and develop an efficient algorithm for finding the parent node of the reward node using atomic interventions. We derive the exact equation for the expected number of interventions performed by the algorithm and show that under certain graphical conditions it could perform either logarithmically fast or, under more general assumptions, slower but still sublinearly in the number of variables. We formally show that our algorithm is optimal as it meets the universal lower bound we establish for any algorithm that performs atomic interventions. Finally, we extend our algorithm to the case when the reward node has multiple parents. Using this algorithm together with a standard algorithm from bandit literature leads to improved regret bounds.

data mining, machine learning, node, (20 more...)

arXiv.org Artificial Intelligence

2301.11401

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Learning Causal Graphs via Monotone Triangular Transport Maps

Akbari, Sina, Ganassali, Luca, Kiyavash, Negar

arXiv.org Artificial IntelligenceMay-26-2023

We study the problem of causal structure learning from data using optimal transport (OT). Specifically, we first provide a constraint-based method which builds upon lower-triangular monotone parametric transport maps to design conditional independence tests which are agnostic to the noise distribution. We provide an algorithm for causal discovery up to Markov Equivalence with no assumptions on the structural equations/noise distributions, which allows for settings with latent variables. Our approach also extends to score-based causal discovery by providing a novel means for defining scores. This allows us to uniquely recover the causal graph under additional identifiability and structural assumptions, such as additive noise or post-nonlinear models. We provide experimental results to compare the proposed approach with the state of the art on both synthetic and real-world datasets.

artificial intelligence, causal discovery, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2305.1821

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Add feedback

Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function

Masiha, Saeed, Salehkaleybar, Saber, He, Niao, Kiyavash, Negar, Thiran, Patrick

arXiv.org Artificial IntelligenceJan-20-2023

We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of functions satisfying gradient dominance property with $1\le\alpha\le2$ which holds in a wide range of applications in machine learning and signal processing. This condition ensures that any first-order stationary point is a global optimum. We prove that the total sample complexity of SCRN in achieving $\epsilon$-global optimum is $\mathcal{O}(\epsilon^{-7/(2\alpha)+1})$ for $1\le\alpha< 3/2$ and $\mathcal{\tilde{O}}(\epsilon^{-2/(\alpha)})$ for $3/2\le\alpha\le 2$. SCRN improves the best-known sample complexity of stochastic gradient descent. Even under a weak version of gradient dominance property, which is applicable to policy-based reinforcement learning (RL), SCRN achieves the same improvement over stochastic policy gradient methods. Additionally, we show that the average sample complexity of SCRN can be reduced to ${\mathcal{O}}(\epsilon^{-2})$ for $\alpha=1$ using a variance reduction method with time-varying batch sizes. Experimental results in various RL settings showcase the remarkable performance of SCRN compared to first-order methods.

artificial intelligence, gradient dominance property, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2205.12856

Country: Europe > Switzerland (0.46)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

Causal Discovery in Linear Structural Causal Models with Deterministic Relations

Yang, Yuqin, Nafea, Mohamed, Ghassami, AmirEmad, Kiyavash, Negar

arXiv.org Artificial IntelligenceOct-30-2021

Linear structural causal models (SCMs) -- in which each observed variable is generated by a subset of the other observed variables as well as a subset of the exogenous sources -- are pervasive in causal inference and casual discovery. However, for the task of causal discovery, existing work almost exclusively focus on the submodel where each observed variable is associated with a distinct source with non-zero variance. This results in the restriction that no observed variable can deterministically depend on other observed variables or latent confounders. In this paper, we extend the results on structure learning by focusing on a subclass of linear SCMs which do not have this property, i.e., models in which observed variables can be causally affected by any subset of the sources, and are allowed to be a deterministic function of other observed variables or latent confounders. This allows for a more realistic modeling of influence or information propagation in systems. We focus on the task of causal discovery form observational data generated from a member of this subclass. We derive a set of necessary and sufficient conditions for unique identifiability of the causal structure. To the best of our knowledge, this is the first work that gives identifiability results for causal discovery under both latent confounding and deterministic relationships. Further, we propose an algorithm for recovering the underlying causal structure when the aforementioned conditions are satisfied. We validate our theoretical results both on synthetic and real datasets.

artificial intelligence, machine learning, possible parent, (15 more...)

arXiv.org Artificial Intelligence

2111.00341

Country:

Europe (0.28)
Asia > China (0.27)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback

The Complexity of Nonconvex-Strongly-Concave Minimax Optimization

Zhang, Siqi, Yang, Junchi, Guzmán, Cristóbal, Kiyavash, Negar, He, Niao

arXiv.org Machine LearningMar-29-2021

This paper studies the complexity for finding approximate stationary points of nonconvex-strongly-concave (NC-SC) smooth minimax problems, in both general and averaged smooth finite-sum settings. We establish nontrivial lower complexity bounds of $\Omega(\sqrt{\kappa}\Delta L\epsilon^{-2})$ and $\Omega(n+\sqrt{n\kappa}\Delta L\epsilon^{-2})$ for the two settings, respectively, where $\kappa$ is the condition number, $L$ is the smoothness constant, and $\Delta$ is the initial gap. Our result reveals substantial gaps between these limits and best-known upper bounds in the literature. To close these gaps, we introduce a generic acceleration scheme that deploys existing gradient-based methods to solve a sequence of crafted strongly-convex-strongly-concave subproblems. In the general setting, the complexity of our proposed algorithm nearly matches the lower bound; in particular, it removes an additional poly-logarithmic dependence on accuracy present in previous works. In the averaged smooth finite-sum setting, our proposed algorithm improves over previous algorithms by providing a nearly-tight dependence on the condition number.

algorithm, artificial intelligence, optimization problem, (14 more...)

arXiv.org Machine Learning

2103.15888

Country:

North America > United States > Illinois (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

A Recursive Markov Blanket-Based Approach to Causal Structure Learning

Mokhtarian, Ehsan, Akbari, Sina, Ghassami, AmirEmad, Kiyavash, Negar

arXiv.org Artificial IntelligenceOct-10-2020

One of the main approaches for causal structure learning is constraint-based methods. These methods are particularly valued as they are guaranteed to asymptotically find a structure which is statistically equivalent to the ground truth. However, they may require exponentially large number of conditional independence (CI) tests in the number of variables of the system. In this paper, we propose a novel recursive constraint-based method for causal structure learning. The key idea of the proposed approach is to recursively use Markov blanket information in order to identify a variable that can be removed from the set of variables without changing the statistical relations among the remaining variables. Once such a variable is found, its neighbors are identified, the removable variable is removed, and the Markov blanket information of the remaining variables is updated. Our proposed approach reduces the required number of conditional independence tests for structure learning compared to the state of the art. We also provide a lower bound on the number of CI tests required by any constraint-based method. Comparing this lower bound to our achievable bound demonstrates the efficiency of our approach. We evaluate and compare the performance of the proposed method on both synthetic and real world structures against the state of the art.

artificial intelligence, ci test, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2010.04992

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

LazyIter: A Fast Algorithm for Counting Markov Equivalent DAGs and Designing Experiments

AhmadiTeshnizi, Ali, Salehkaleybar, Saber, Kiyavash, Negar

arXiv.org Artificial IntelligenceJun-17-2020

The causal relationships among a set of random variables are commonly represented by a Directed Acyclic Graph (DAG), where there is a directed edge from variable $X$ to variable $Y$ if $X$ is a direct cause of $Y$. From the purely observational data, the true causal graph can be identified up to a Markov Equivalence Class (MEC), which is a set of DAGs with the same conditional independencies between the variables. The size of an MEC is a measure of complexity for recovering the true causal graph by performing interventions. We propose a method for efficient iteration over possible MECs given intervention results. We utilize the proposed method for computing MEC sizes and experiment design in active and passive learning settings. Compared to previous work for computing the size of MEC, our proposed algorithm reduces the time complexity by a factor of $O(n)$ for sparse graphs where $n$ is the number of variables in the system. Additionally, integrating our approach with dynamic programming, we design an optimal algorithm for passive experiment design. Experimental results show that our proposed algorithms for both computing the size of MEC and experiment design outperform the state of the art.

artificial intelligence, graph, optimization problem, (13 more...)

arXiv.org Artificial Intelligence

2006.0967

Country:

Europe > Switzerland (0.14)
Asia > Middle East > Iran (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Add feedback

Characterizing Distribution Equivalence for Cyclic and Acyclic Directed Graphs

Ghassami, AmirEmad, Zhang, Kun, Kiyavash, Negar

arXiv.org Machine LearningOct-28-2019

The main way for defining equivalence among acyclic directed graphs is based on the conditional independencies of the distributions that they can generate. However, it is known that when cycles are allowed in the structure, conditional independence is not a suitable notion for equivalence of two structures, as it does not reflect all the information in the distribution that can be used for identification of the underlying structure. In this paper, we present a general, unified notion of equivalence for linear Gaussian directed graphs. Our proposed definition for equivalence is based on the set of distributions that the structure is able to generate. We take a first step towards devising methods for characterizing the equivalence of two structures, which may be cyclic or acyclic. Additionally, we propose a score-based method for learning the structure from observational data.

artificial intelligence, machine learning, rotation, (16 more...)

arXiv.org Machine Learning

1910.12993

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.89)

Add feedback