Hinojosa, Carlos
Structured-Noise Masked Modeling for Video, Audio and Beyond
Bhowmik, Aritra, Thoker, Fida Mohammad, Hinojosa, Carlos, Ghanem, Bernard, Snoek, Cees G. M.
Masked modeling has emerged as a powerful self-supervised learning framework, but existing methods largely rely on random masking, disregarding the structural properties of different modalities. In this work, we introduce structured noise-based masking, a simple yet effective approach that naturally aligns with the spatial, temporal, and spectral characteristics of video and audio data. By filtering white noise into distinct color noise distributions, we generate structured masks that preserve modality-specific patterns without requiring handcrafted heuristics or access to the data. Our approach improves the performance of masked video and audio modeling frameworks without any computational overhead. Extensive experiments demonstrate that structured noise masking achieves consistent improvement over random masking for standard and advanced masked modeling methods, highlighting the importance of modality-aware masking strategies for representation learning.
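To make the idea concrete, below is a minimal NumPy sketch of structured-noise masking, assuming a 14x14 patch grid and a 75% masking ratio (illustrative values, not taken from the paper). Low-pass filtering white noise yields spatially correlated ("colored") noise, and keeping the highest-valued patches produces masks that form contiguous regions rather than an i.i.d. random scatter.

```python
# Minimal sketch of structured-noise masking (illustrative, not the paper's code).
import numpy as np
from scipy.ndimage import gaussian_filter

def structured_mask(grid_h=14, grid_w=14, mask_ratio=0.75, sigma=1.5, rng=None):
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal((grid_h, grid_w))   # white noise on the patch grid
    colored = gaussian_filter(noise, sigma=sigma)   # low-pass -> spatially correlated noise
    k = int(mask_ratio * grid_h * grid_w)           # number of patches to mask
    top = np.argsort(colored.ravel())[::-1][:k]     # patches with the largest noise values
    mask = np.zeros(grid_h * grid_w, dtype=bool)
    mask[top] = True                                # True = patch hidden from the encoder
    return mask.reshape(grid_h, grid_w)

mask = structured_mask()
print(mask.sum(), "of", mask.size, "patches masked")  # 147 of 196
```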
ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders
Hinojosa, Carlos, Liu, Shuming, Ghanem, Bernard
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework, offering remarkable performance across a wide range of downstream tasks. To increase the difficulty of the pretext task and learn richer visual representations, existing works have focused on replacing standard random masking with more sophisticated strategies, such as adversarial-guided and teacher-guided masking. However, these strategies depend on the input data, which commonly increases model complexity and requires additional computation to generate the mask patterns. This raises the question: Can we enhance MAE performance beyond random masking without relying on input data or incurring additional computational costs? In this work, we introduce a simple yet effective data-independent method, termed ColorMAE, which generates different binary mask patterns by filtering random noise. Drawing inspiration from color noise in image processing, we explore four types of filters to yield mask patterns with different spatial and semantic priors. ColorMAE requires no additional learnable parameters or computational overhead in the network, yet it significantly enhances the learned representations. We provide a comprehensive empirical evaluation demonstrating our strategy's superiority over random masking on downstream tasks. Notably, we report an improvement of 2.72 mIoU points on semantic segmentation relative to baseline MAE implementations.
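As a rough sketch of the color-noise idea, the snippet below uses Gaussian filters and their differences as stand-ins for the paper's filter bank (the actual filters may differ). Binarizing any of these noise fields by keeping the top-k patch values, as in the previous sketch, yields the corresponding binary mask.

```python
# Illustrative color-noise generators; filter choices are assumptions, not ColorMAE's exact filters.
import numpy as np
from scipy.ndimage import gaussian_filter

def color_noise(shape, color, sigma=2.0, rng=None):
    rng = rng or np.random.default_rng()
    white = rng.standard_normal(shape)
    low = gaussian_filter(white, sigma)
    if color == "red":                         # low frequencies: large connected blobs
        return low
    if color == "blue":                        # high frequencies: fine, evenly spread detail
        return white - low
    if color == "green":                       # band-pass: mid-sized clusters
        return low - gaussian_filter(white, 2.0 * sigma)
    return white                               # plain white noise = standard random masking
```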
Privacy-preserving Optics for Enhancing Protection in Face De-identification
Lopez, Jhon, Hinojosa, Carlos, Arguello, Henry, Ghanem, Bernard
The modern surge in camera usage, alongside the widespread application of computer vision technology, poses significant privacy and security concerns. Current artificial intelligence (AI) technologies aid in recognizing relevant events and assisting in daily tasks in homes, offices, hospitals, etc. The need to access or process personal information for these purposes raises privacy concerns. While software-level solutions like face de-identification provide a good privacy/utility trade-off, they are vulnerable to sniffing attacks. In this paper, we propose a hardware-level face de-identification method to address this vulnerability. Specifically, our approach first learns an optical encoder along with a regression model to obtain a face heatmap while hiding the face identity from the source image. We also propose an anonymization framework that generates a new face using the privacy-preserving image, the face heatmap, and a reference face image from a public dataset as input. We validate our approach with extensive simulations and hardware experiments.
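The following simplified PyTorch sketch illustrates the hardware-level idea: the lens is modeled as a learnable point-spread function (a stand-in for the paper's optical encoder), and a small CNN regresses a face heatmap from the degraded measurement. All module names and layer sizes are illustrative, not the paper's architecture.

```python
# Hedged sketch: learnable-optics encoder + heatmap regressor (illustrative modules).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrivacyLens(nn.Module):
    def __init__(self, kernel_size=15):
        super().__init__()
        # Learnable point-spread function shared across RGB channels.
        self.psf = nn.Parameter(torch.rand(1, 1, kernel_size, kernel_size))

    def forward(self, x):                                             # x: (B, 3, H, W)
        psf = F.softmax(self.psf.flatten(), 0).view_as(self.psf)      # energy-preserving blur
        psf = psf.expand(3, 1, -1, -1)
        return F.conv2d(x, psf, padding=self.psf.shape[-1] // 2, groups=3)

class HeatmapRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),                                      # one-channel face heatmap
        )

    def forward(self, x):
        return self.net(x)

lens, regressor = PrivacyLens(), HeatmapRegressor()
image = torch.rand(2, 3, 128, 128)
private_image = lens(image)                                           # identity-obscured measurement
heatmap = regressor(private_image)                                    # (2, 1, 128, 128)
```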
SoccerNet 2023 Challenges Results
Cioppa, Anthony, Giancola, Silvio, Somers, Vladimir, Magera, Floriane, Zhou, Xin, Mkhallati, Hassan, Deliège, Adrien, Held, Jan, Hinojosa, Carlos, Mansourian, Amir M., Miralles, Pierre, Barnich, Olivier, De Vleeschouwer, Christophe, Alahi, Alexandre, Ghanem, Bernard, Van Droogenbroeck, Marc, Kamal, Abdullah, Maglo, Adrien, Clapés, Albert, Abdelaziz, Amr, Xarles, Artur, Orcesi, Astrid, Scott, Atom, Liu, Bin, Lim, Byoungkwon, Chen, Chen, Deuser, Fabian, Yan, Feng, Yu, Fufu, Shitrit, Gal, Wang, Guanshuo, Choi, Gyusik, Kim, Hankyul, Guo, Hao, Fahrudin, Hasby, Koguchi, Hidenari, Ardö, Håkan, Salah, Ibrahim, Yerushalmy, Ido, Muhammad, Iftikar, Uchida, Ikuma, Be'ery, Ishay, Rabarisoa, Jaonary, Lee, Jeongae, Fu, Jiajun, Yin, Jianqin, Xu, Jinghang, Nang, Jongho, Denize, Julien, Li, Junjie, Zhang, Junpei, Kim, Juntae, Synowiec, Kamil, Kobayashi, Kenji, Zhang, Kexin, Habel, Konrad, Nakajima, Kota, Jiao, Licheng, Ma, Lin, Wang, Lizhi, Wang, Luping, Li, Menglong, Zhou, Mengying, Nasr, Mohamed, Abdelwahed, Mohamed, Liashuha, Mykola, Falaleev, Nikolay, Oswald, Norbert, Jia, Qiong, Pham, Quoc-Cuong, Song, Ran, Hérault, Romain, Peng, Rui, Chen, Ruilong, Liu, Ruixuan, Baikulov, Ruslan, Fukushima, Ryuto, Escalera, Sergio, Lee, Seungcheon, Chen, Shimin, Ding, Shouhong, Someya, Taiga, Moeslund, Thomas B., Li, Tianjiao, Shen, Wei, Zhang, Wei, Li, Wei, Dai, Wei, Luo, Weixin, Zhao, Wending, Zhang, Wenjie, Yang, Xinquan, Ma, Yanbiao, Joo, Yeeun, Zeng, Yingsen, Gan, Yiyang, Zhu, Yongqiang, Zhong, Yujie, Ruan, Zheng, Li, Zhiheng, Huang, Zhijian, Meng, Ziyu
The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges comprised seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, focusing on retrieving all timestamps related to global actions in soccer, (2) ball action spotting, focusing on retrieving all timestamps related to the soccer ball change of state, and (3) dense video captioning, focusing on describing the broadcast with natural language and anchored timestamps. The second theme, field understanding, relates to the single task of (4) camera calibration, focusing on retrieving the intrinsic and extrinsic camera parameters from images. The third and last theme, player understanding, is composed of three low-level tasks related to extracting information about the players: (5) re-identification, focusing on retrieving the same players across multiple views, (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams, and (7) jersey number recognition, focusing on recognizing the jersey number of players from tracklets. Compared to the previous editions of the SoccerNet challenges, tasks (2), (3), and (7) are novel, including new annotations and data, task (4) was enhanced with more data and annotations, and task (6) now focuses on end-to-end approaches. More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org. Baselines and development kits can be found at https://github.com/SoccerNet.
Learning Semantic Segmentation with Query Points Supervision on Aerial Images
Rivier, Santiago, Hinojosa, Carlos, Giancola, Silvio, Ghanem, Bernard
Semantic segmentation is crucial in remote sensing, where high-resolution satellite images are segmented into meaningful regions. Recent advancements in deep learning have significantly improved satellite image segmentation. However, most of these methods are typically trained in fully supervised settings that require high-quality pixel-level annotations, which are expensive and time-consuming to obtain. In this work, we present a weakly supervised learning approach that trains semantic segmentation models using only query point annotations instead of full mask labels. Our proposed approach performs accurate semantic segmentation and improves efficiency by significantly reducing the cost and time required for manual annotation. Specifically, we generate superpixels and extend the query point labels to the superpixels that group similar, meaningful semantics. Then, we train semantic segmentation models supervised with images partially labeled with these superpixel pseudo-labels. We benchmark our weakly supervised training approach on an aerial image dataset and several semantic segmentation architectures, showing that we can reach performance competitive with fully supervised training while substantially reducing the annotation effort.
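A minimal sketch of the pseudo-label step, assuming SLIC superpixels and query points given as (row, col, class) tuples; the value 255 marks unlabeled pixels that a segmentation loss would ignore, and all parameter values are illustrative.

```python
# Spread each sparse point label over the superpixel containing it (illustrative sketch).
import numpy as np
from skimage.segmentation import slic

def point_to_superpixel_labels(image, points, n_segments=500, ignore_index=255):
    segments = slic(image, n_segments=n_segments, compactness=10)  # (H, W) superpixel ids
    pseudo = np.full(segments.shape, ignore_index, dtype=np.uint8)
    for row, col, cls in points:
        pseudo[segments == segments[row, col]] = cls  # point label -> whole superpixel
    return pseudo
```

A standard segmentation model can then be trained with a cross-entropy loss that ignores the 255-valued pixels, so supervision comes only from the labeled superpixels.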
Mask-guided Data Augmentation for Multiparametric MRI Generation with a Rare Hepatocellular Carcinoma
Sanchez, Karen, Hinojosa, Carlos, Arias, Kevin, Arguello, Henry, Kouame, Denis, Meyrignac, Olivier, Basarab, Adrian
Data augmentation is classically used to improve the overall performance of deep learning models. It is, however, challenging for medical applications, and in particular for multiparametric datasets. For example, the traditional geometric transformations used in many applications to generate synthetic images can modify the patients' anatomy in unrealistic ways. Therefore, dedicated image generation techniques are necessary in the medical field to, for example, mimic a given pathology realistically. This paper introduces a new data augmentation architecture that generates synthetic multiparametric (T1 arterial, T1 portal, and T2) magnetic resonance images (MRI) of massive macrotrabecular-subtype hepatocellular carcinoma, together with their corresponding tumor masks, through a generative deep learning approach. The proposed architecture creates liver tumor masks and abdominal edges that are used as input to a Pix2Pix network for synthetic data creation. The method's efficiency is demonstrated by training it on a limited multiparametric dataset of MRI triplets from 89 patients with liver lesions to generate 1,000 synthetic triplets and their corresponding liver tumor masks. The resulting Fréchet Inception Distance score was 86.55. The proposed approach was among the winners of the 2021 data augmentation challenge organized by the French Society of Radiology.
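As an illustration of how a conditioning input for Pix2Pix could be assembled, the sketch below stacks a binary tumor mask with a Canny edge map of an abdominal slice. The mask-generation network and the Pix2Pix model itself are not reproduced here, and the edge thresholds are assumptions, not the paper's values.

```python
# Illustrative conditioning input for a Pix2Pix generator (mask + edge channels).
import numpy as np
import cv2

def pix2pix_condition(abdominal_slice, tumor_mask):
    # abdominal_slice: uint8 grayscale MRI slice; tumor_mask: binary (H, W) array.
    edges = cv2.Canny(abdominal_slice, 50, 150)        # abdominal edge map
    cond = np.stack([tumor_mask.astype(np.float32),
                     edges.astype(np.float32) / 255.0], axis=0)
    return cond                                        # (2, H, W) generator input
```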
PrivHAR: Recognizing Human Actions From Privacy-preserving Lens
Hinojosa, Carlos, Marquez, Miguel, Arguello, Henry, Adeli, Ehsan, Fei-Fei, Li, Niebles, Juan Carlos
The accelerated use of digital cameras raises growing concerns about privacy and security, particularly in applications such as action recognition. In this paper, we propose an optimization framework that provides robust visual privacy protection along the human action recognition pipeline. Our framework parameterizes the camera lens to degrade video quality so as to inhibit privacy attributes and protect against adversarial attacks, while maintaining the features relevant for activity recognition.
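A hedged sketch of the kind of joint objective such a framework optimizes: the lens parameters are trained so the action classifier stays accurate on degraded video while an adversarial classifier fails to recover private attributes. The loss names and weighting are illustrative, not the paper's exact formulation.

```python
# Illustrative utility-vs-privacy objective for training the lens parameters.
import torch
import torch.nn.functional as F

def lens_objective(action_logits, action_labels,
                   privacy_logits, privacy_labels, lam=1.0):
    utility = F.cross_entropy(action_logits, action_labels)        # keep actions recognizable
    privacy_leak = F.cross_entropy(privacy_logits, privacy_labels) # adversary's attribute loss
    return utility - lam * privacy_leak  # lens minimizes utility loss, maximizes leak loss
```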