Plotting

 Education



I'm a Public-School English Teacher. The Most Vocal Defenders of K–12 Liberal Arts Are Not Who You'd Expect.

Slate

On May 6, the Texas House Committee on Public Education discussed S.B. 13, a bill seeking to remove from public school libraries and classrooms all "profane" and "indecent content." At the hearing, Republican Rep. Terri Leo-Wilson focused on the concern that the legislation could harm the transmission of cultural heritage by depriving students of "classics." She explained, using an adjective that in our current culture wars has come to describe a type of humanities education favored by conservatives, that her "kids were classically trained, so they had their graduation picture with all sorts of books … classic works of literature." When an activist commenting during the hearing remarked that among renowned writers, Toni Morrison's work is singularly "very sexualized," Leo-Wilson replied, without reference to any one book, "She might be famous, but that's not considered, I don't think, a classic."


Let's Talk About ChatGPT and Cheating in the Classroom

WIRED

There's been a lot of talk about how AI tools like ChatGPT are changing education. Students are using AI to do research, write papers, and get better grades. So today on the show, we debate whether using AI in school is actually cheating. Plus, we dive into how students and teachers are using these tools, and we ask what place AI should have in the future of learning. Write to us at uncannyvalley@wired.com.


Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis

Neural Information Processing Systems

Although recent point cloud analysis achieves impressive progress, the paradigm of representation learning from a single modality gradually meets its bottleneck. In this work, we take a step towards more discriminative 3D point cloud representations by taking full advantage of images, which inherently contain richer appearance information, e.g., texture, color, and shade. Specifically, this paper introduces a simple but effective point cloud cross-modality training (PointCMT) strategy, which utilizes view images, i.e., rendered or projected 2D images of the 3D object, to boost point cloud analysis. In practice, to effectively acquire auxiliary knowledge from view images, we develop a teacher-student framework and formulate the cross-modal learning as a knowledge distillation problem. PointCMT eliminates the distribution discrepancy between the two modalities through novel feature and classifier enhancement criteria and effectively avoids potential negative transfer. Note that PointCMT improves the point-only representation without any architecture modification. Extensive experiments verify significant gains on various datasets with appealing backbones: equipped with PointCMT, PointNet++ and PointMLP achieve state-of-the-art performance on two benchmarks, i.e., 94.4% and 86.7% accuracy on ModelNet40 and ScanObjectNN, respectively. Code will be made available at https://github.com/ZhanHeshen/PointCMT.
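As a rough illustration of the teacher-student formulation described in the abstract (not the paper's implementation; the function names, temperature, and loss weighting below are hypothetical), a single cross-modal distillation step could look like this:

import torch
import torch.nn.functional as F

def cross_modal_distillation_step(point_student, image_teacher, points, views, labels,
                                  temperature=4.0, alpha=0.5):
    """One training step: a point cloud student mimics a frozen image teacher.

    point_student / image_teacher are assumed to return classification logits;
    all names and hyperparameters are illustrative, not taken from the paper.
    """
    with torch.no_grad():
        teacher_logits = image_teacher(views)    # logits from rendered/projected view images
    student_logits = point_student(points)       # logits from raw point clouds

    # Soft-target distillation loss (standard knowledge-distillation formulation)
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    # Ordinary supervised loss on ground-truth labels
    ce_loss = F.cross_entropy(student_logits, labels)
    return alpha * kd_loss + (1 - alpha) * ce_loss

At inference only the point cloud student would be used, consistent with the abstract's claim that the point-only representation improves without architecture changes.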


Forget Cocomelon--this kids' app won't rot their brains

Popular Science

If your child loves their tablet but you struggle to find appropriate games, try Pok Pok, a learning app for kids aged 2–8 that doesn't feel like learning. It features a collection of calming, open-ended digital toys that help children explore STEM, problem-solving, creativity, and more without ads, in-app purchases, or overstimulation. Built by parents in collaboration with early childhood experts, Pok Pok offers a Montessori-inspired experience that supports healthy screen time and lifelong learning. Kids using Pok Pok build foundational skills in STEM, problem-solving, language, numbers, cause and effect, and emotional development. Each game is open-ended, so there's no "winning" or "losing."


Supplementary Material, Appendix A: Derivations and Further Technical Details (A.1 Proof of Proposition 1)

Neural Information Processing Systems

Following Haarnoja et al. [13], we can now rewrite Equation (A.4) accordingly.

A.3 Regularized Maximum Likelihood Estimation

To address the collapse in predictive variance away from the offline dataset under MLE training seen in Figure 1, Wu et al. [51] in practice augment the usual MLE loss with an entropy bonus. Whilst entropy regularization partially mitigates the collapse of predictive variance away from the expert demonstrations, we still observe the same undesirable trend as in Figure 1, with predictive variances high near the expert demonstrations and low on unseen data. The variance surface also becomes more poorly behaved, with "islands" of high predictive variance appearing away from the data. Figure 12 shows the predictive variances of behavioral policies trained on expert demonstrations for the "door-binary-v0" environment with varying Tikhonov regularization coefficients λ. Similarly, Tikhonov regularization does not resolve the issue with the calibration of uncertainties. We also observe that too high a regularization strength causes the model to underfit the variances of the data.
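A plausible form of the entropy-augmented objective mentioned above, assuming the standard MLE loss plus a weighted entropy bonus (the exact weighting and notation in Wu et al. [51] may differ), is

\[
\hat{\pi} \in \arg\max_{\pi} \; \mathbb{E}_{(s,a) \sim \mathcal{D}} \big[ \log \pi(a \mid s) \big] \; + \; \lambda \, \mathbb{E}_{s \sim \mathcal{D}} \big[ \mathcal{H}\big(\pi(\cdot \mid s)\big) \big],
\]

where \(\mathcal{D}\) denotes the expert demonstrations, \(\mathcal{H}\) the entropy of the policy's predictive distribution, and \(\lambda\) the regularization coefficient.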


On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

Neural Information Processing Systems

KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks. However, we show that KL-regularized reinforcement learning with behavioral reference policies derived from expert demonstrations can suffer from pathological training dynamics that can lead to slow, unstable, and suboptimal online learning. We show empirically that the pathology occurs for commonly chosen behavioral policy classes and demonstrate its impact on sample efficiency and online policy performance. Finally, we show that the pathology can be remedied by non-parametric behavioral reference policies and that this allows KL-regularized reinforcement learning to significantly outperform state-of-the-art approaches on a variety of challenging locomotion and dexterous hand manipulation tasks.
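As context for the abstract, the objective it refers to is conventionally written as an expected return penalized by divergence from a behavioral reference policy; a standard form (the paper's exact notation and temperature parameter may differ) is

\[
J(\pi) = \mathbb{E}_{\pi} \Big[ \sum_{t=0}^{\infty} \gamma^{t} \Big( r(s_t, a_t) - \alpha \, D_{\mathrm{KL}}\big( \pi(\cdot \mid s_t) \,\|\, \pi_{0}(\cdot \mid s_t) \big) \Big) \Big],
\]

where \(\pi_0\) is the behavioral reference policy derived from expert demonstrations and \(\alpha\) controls the regularization strength. The supplementary excerpt above ties the pathology to reference policies whose predictive variance collapses away from the demonstrations.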


The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Neural Information Processing Systems

The performance of a large language model (LLM) depends heavily on the quality and size of its pretraining dataset. However, the pretraining datasets for state-of-the-art open LLMs like Llama 3 and Mixtral are not publicly available and very little is known about how they were created. In this work, we introduce FineWeb, a 15-trillion token dataset derived from 96 Common Crawl snapshots that produces better-performing LLMs than other open pretraining datasets. To advance the understanding of how best to curate high-quality pretraining datasets, we carefully document and ablate all of the design choices used in FineWeb, including in-depth investigations of deduplication and filtering strategies. In addition, we introduce FineWeb-Edu, a 1.3-trillion token collection of educational text filtered from FineWeb.
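Both datasets are released publicly; a minimal sketch of streaming a published sample through the Hugging Face datasets library follows (the repository ID, config name, and field name are assumptions based on the public release and may change):

from datasets import load_dataset  # pip install datasets

# Stream a small published sample of FineWeb instead of downloading all 15T tokens.
# "HuggingFaceFW/fineweb" and the "sample-10BT" config are assumed from the public release.
fineweb = load_dataset(
    "HuggingFaceFW/fineweb",
    name="sample-10BT",
    split="train",
    streaming=True,
)

for i, example in enumerate(fineweb):
    print(example["text"][:200])  # each record is assumed to carry the extracted web text
    if i >= 2:
        break

For FineWeb-Edu, the analogous repository ID would be "HuggingFaceFW/fineweb-edu".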


Unsupervised Representation Transfer for Small Networks: I Believe I Can Distill On-the-Fly

Neural Information Processing Systems

Recent remarkable improvements in unsupervised visual representation learning have been based on heavy networks with large-batch training. While recent methods have greatly reduced the gap between supervised and unsupervised performance of deep models such as ResNet-50, this development has been relatively limited for small models. In this work, we propose a novel unsupervised learning framework for small networks that combines deep self-supervised representation learning and knowledge distillation within one-phase training. In particular, a teacher model is trained to produce consistent cluster assignments between different views of the same image. Simultaneously, a student model is encouraged to mimic the predictions of the on-the-fly self-supervised teacher. For effective knowledge transfer, we adopt the idea of a domain classifier so that student training is guided by discriminative features invariant to the representational space shift between teacher and student. We also introduce a network-driven multi-view generation paradigm to capture the rich feature information contained in the network itself. Extensive experiments show that our student models surpass state-of-the-art offline-distilled networks, even those distilled from stronger self-supervised teachers, as well as top-performing self-supervised models. Notably, our ResNet-18, trained with a ResNet-50 teacher, achieves 68.3% ImageNet Top-1 accuracy on frozen-feature linear evaluation, which is only 1.5% below the supervised baseline.
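The domain-classifier idea mentioned in the abstract can be illustrated with a gradient-reversal layer: a small classifier tries to tell teacher features from student features, while the reversed gradient pushes the student towards features the classifier cannot separate. Everything below (module names, feature dimensions, classifier architecture) is an illustrative sketch, not the paper's implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

def adversarial_alignment_loss(teacher_feat, student_feat, domain_clf):
    """The domain classifier separates teacher (label 0) from student (label 1) features;
    the reversed gradient trains the student to make its features indistinguishable.
    All shapes and names are hypothetical."""
    feats = torch.cat([teacher_feat.detach(), GradReverse.apply(student_feat)], dim=0)
    labels = torch.cat([
        torch.zeros(teacher_feat.size(0), dtype=torch.long),
        torch.ones(student_feat.size(0), dtype=torch.long),
    ]).to(feats.device)
    return F.cross_entropy(domain_clf(feats), labels)

# Example: a tiny MLP as the domain classifier over 512-d features (hypothetical dimensions).
domain_clf = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 2))

In the one-phase setup the abstract describes, such an adversarial alignment term would be added alongside the teacher's self-supervised clustering loss and the student's imitation loss.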


On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion

Neural Information Processing Systems

Efficient fine-tuning of large language models for task-specific applications is imperative, yet the vast number of parameters in these models makes their training increasingly challenging. Despite numerous proposals for effective methods, a substantial memory overhead remains for gradient computations during updates. Can we fine-tune a series of task-specific small models and transfer their knowledge directly to a much larger model without additional training? In this paper, we explore weak-to-strong specialization using logit arithmetic, facilitating a direct answer to this question. Existing weak-to-strong methods often employ a static knowledge transfer ratio and a single small model for transferring complex knowledge, which leads to suboptimal performance.
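The logit-arithmetic question posed in the abstract can be made concrete with a small sketch. At each decoding step, the large model's logits are shifted by the difference between a fine-tuned small model and its untuned counterpart; the static transfer ratio alpha below corresponds to the baseline the abstract criticizes, whereas dynamic fusion would adapt it. Model names are placeholders, and HuggingFace-style causal LMs with a shared vocabulary are assumed:

import torch

def fused_next_token_logits(large_base, small_base, small_tuned, input_ids, alpha=1.0):
    """Weak-to-strong transfer via logit arithmetic with a static transfer ratio alpha.

    large_base:  big untuned model to be specialized without additional training
    small_base:  small untuned model
    small_tuned: small model fine-tuned on the target task
    All three are assumed to share the same tokenizer and vocabulary.
    """
    with torch.no_grad():
        z_large = large_base(input_ids).logits[:, -1, :]
        z_small = small_base(input_ids).logits[:, -1, :]
        z_tuned = small_tuned(input_ids).logits[:, -1, :]
    # Add the task-specific "delta" learned by the small model to the large model's logits.
    return z_large + alpha * (z_tuned - z_small)

# Greedy decoding with the fused logits (illustrative):
# next_token = fused_next_token_logits(llm, slm, slm_ft, ids).argmax(dim=-1)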