EdTech Discovery
Hermes

An instrument for spotting the next edtech opportunity — generated ideas, each traced to the real-world signals behind it.

Updated Jun 24, 2026 · 10 ideas · 1304 signals

Signals

The evidence library — the raw signals the pipeline is watching across the education ecosystem. Every idea is built from these.

technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Real-Time Interactive Music Generation via Data-Free Streaming Consistency Distillation

arXiv:2606.24307v1 Announce Type: cross Abstract: Interactive music and live performance rely on real-time human expression, but modern generative music AI remains largely absent from this domain due to its prohibitive inference latency and offline rendering paradigm. To provide pioneer musicians with a novel medium for interactive composition, we should fundamentally change these static models into dynamic, playable instruments. In this paper, we propose a framework that bridges this gap. To achieve the low latency required for live interaction without sacrificing structural coherence, we formulate distillation within a streaming autoregressive latent space. Our approach gets rid of the need for expensive paired audio-latent datasets by utilizing prompt-only inputs to synthesize teacher-guided, chunk-wise trajectories on the fly. Because live instruments require high acoustic fidelity, we introduce music-aware consistency objectives, which combine latent, spectral, and temporal-diff

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Dialogue to Discovery: Attribute-Aware Preference Elicitation for Conversational Product Search Assistants

arXiv:2606.24194v1 Announce Type: cross Abstract: Conversational product search assistants offer a more expressive, natural, and interactive alternative to traditional keyword-based product search. With limited screen space, showing only a few items increases the need for precise preference elicitation, which can prolong conversations, leading to user frustration and session abandonment. Conversely, rushing to recommend items without a clear understanding of preferences risks poor matches and a degraded user experience. We present Dialogue to Discovery (D2D), an attribute-oriented preference elicitation framework that dynamically exploits the structure of product attributes to efficiently steer conversations toward the user's desired item. D2D adaptively prioritizes the most informative queries and strategically times product recommendations, reducing premature or off-target suggestions that harm engagement. To evaluate D2D, we curate three datasets from the Amazon Reviews corpus. In s

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Aspect-Based Sentiment Evolution and its Correlation with Review Rounds in Multi-Round Peer Reviews: A Deep Learning Approach

arXiv:2606.24188v1 Announce Type: cross Abstract: Mining sentiment information from the textual content of peer review comments offers valuable insights into the scientific evaluation process. However, previous studies are often constrained by coarse-grained analysis and the lack of differentiation across review rounds. Notably, the dynamic shifts in reviewers' focus and sentiment tendencies throughout multiple review stages remain underexplored. To address this gap, the present study investigates the distribution and evolution of aspect-level sentiments and examines their correlation with the number of review rounds. We begin by segmenting the multi-round review comments of 11,063 accepted papers from Nature Communications and identifying fine-grained review aspect clusters. A manually annotated corpus of approximately 5,000 review sentences is then constructed. Using this dataset, we train a series of deep learning-based aspect sentiment classification models. Among them, the LCF-BER

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Ten Digits on a Train: AI-Assisted Verification of Two Eigenvalue Problems

arXiv:2606.23821v1 Announce Type: cross Abstract: Accurate numerical eigenvalues are often difficult to certify, especially in singular or non-normal settings. This article reports a human–AI collaboration on two such computations. For a singular self-adjoint Schrödinger operator, a verified zero count and Dirichlet–Neumann bracketing certify the complete negative spectrum to ten decimal places. For a delicate non-normal atom–molecule benchmark, a previously unresolved resonance pair is separated, with each member enclosed to ten digits. The second result is achieved not by increasing the precision of one-way shooting, but by reformulating the problem as a global matching system for projective solution lines. The infinite tail is encoded as uncertainty in the terminal projective data, and a componentwise, tail-robust Krawczyk–Brouwer inclusion supplies the certificate. This gives a reusable architecture for analytic boundary-value systems with ill-conditioned propagation and unce

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

EvidenceLens: A Claim-Evidence Matrix for Auditing Financial Question Answering

arXiv:2606.23724v1 Announce Type: cross Abstract: Large language models are increasingly used to answer questions over annual reports, earnings decks, and analyst notes, yet their outputs remain difficult to verify in high-stakes financial workflows. A fluent answer can blend directly grounded statements, weak synthesis, and unsupported claims across narrative text, tables, and charts. We present EvidenceLens, a visual analytics prototype that treats financial question answering as a claim-evidence alignment problem. The system decomposes an answer into atomic claims, summarizes support composition, confidence, and support gaps, and coordinates claim-level inspection with source passages, table cells, and chart regions. Its core visual representation is a multimodal claim-evidence matrix that makes coverage, contradiction, and modality imbalance immediately visible. To support reproducibility, we also specify a JSON-based artifact schema, a lightweight multimodal alignment pipeline, and

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Zero-Shot Neural Priors for Generalizable Cross-Subject and Cross-Task EEG Decoding

arXiv:2606.23706v1 Announce Type: cross Abstract: The development of generalizable electroencephalography (EEG) decoding models is essential for robust brain-computer interfaces (BCI) and objective neural biomarkers in mental health. Conventional approaches have been hindered by poor cross-subject and cross-task generalization, owing to high inter-subject variability and non-stationary neural signals. We address this challenge with a zero-shot cross-subject decoding framework on the large-scale Healthy Brain Network dataset, benchmarking a convolutional neural network baseline, a hybrid LSTM, and a Transformer-based foundation model. To adapt the Transformer for regression while averting catastrophic forgetting, we propose a novel progressive unfreezing strategy. The baseline yielded an nRMSE of 0.9991, whereas our fine-tuned Transformer achieved 0.9799 on unseen subjects. This work advances scalable, calibration-free EEG decoding for computational psychiatry and behavioral prediction.
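
To put the two reported error figures side by side, here is a minimal sketch of a normalized RMSE and of the relative gap between the reported scores. The abstract does not say which normalization it uses; dividing RMSE by the standard deviation of the targets is an assumption made here, and the function names `nrmse` and `relative_reduction` are hypothetical. Under that assumed convention, a score of 1.0 is what a constant prediction of the target mean would obtain.

```python
import math


def nrmse(y_true, y_pred):
    """RMSE divided by the population standard deviation of y_true.

    The normalization is an assumption; the abstract does not specify one.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    var = sum((t - mean) ** 2 for t in y_true) / n
    if var == 0:
        raise ValueError("targets have zero variance; nRMSE is undefined")
    return math.sqrt(mse / var)


def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# Figures quoted in the abstract above.
reported_baseline = 0.9991
reported_transformer = 0.9799

print(f"{relative_reduction(reported_baseline, reported_transformer):.2%}")  # prints 1.92%
```

Read this way, the fine-tuned Transformer lowers the reported nRMSE by roughly 2% relative to the baseline.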

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Evaluating LLM Usage for Efficient and Explainable Numerical and Classified Implicit Sentiment Analysis of Product Desirability

arXiv:2606.23701v1 Announce Type: cross Abstract: Qualitative product feedback can reveal nuanced user experiences, but its implicit sentiment is difficult to measure. This paper presents a scalable and interpretable framework that uses large language models (LLMs) to quantify product desirability from such data. Using two Product Desirability Toolkit (PDT) datasets from ZORQ and CARMA comprising 106 respondent term groupings with gold-standard human annotation, zero-shot continuous numerical sentiment scoring and categorical sentiment classification are evaluated without relying on explicit review scores. Across the datasets, LLMs generated numerical sentiment scores directly from qualitative responses and closely matched expert labels, achieving Pearson correlations up to 0.97 and classification accuracy up to 94%. LLMs maintained robustness even when handling data presented in multiple forms and consistently expressed high confidence. In contrast, lexicon-based and transformer basel

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

A Geometry-Informed Computer Vision Method for Detecting and Examining Overtaking Vehicles From A Bicycle

arXiv:2606.23699v1 Announce Type: cross Abstract: Instrumented bicycle studies have produced direct field evidence on vehicle passing behavior, but extracting overtaking events from continuous rear-facing video has remained dependent on manual, frame-by-frame annotation. This bottleneck constrains sample sizes and limits naturalistic cycling safety research. We present a geometry-informed computer vision pipeline that automates overtaking event detection from a single bicycle-mounted camera without multi-sensor configurations or explicit camera calibration. The system combines RT-DETR object detection with ByteTrack multi-object tracking through a three-stage geometric validation module enforcing bearing angle trend, apparent size growth, and spatial confirmation criteria derived from perspective projection principles. Validated on 315 manually annotated real-world overtaking events from urban roads in Ann Arbor, Michigan, the pipeline achieved 97.8% recall with zero false positives. T

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

"Zooming In" on Agentic Web Browsers as Assistive Technologies: A Case Study with a Low-Vision Technology Expert

arXiv:2606.24870v1 Announce Type: new Abstract: Agentic Web Browsers (AWBs), powered by Large Language Models (LLMs), are emerging as autonomous systems capable of navigating the Web on behalf of users. Beyond enhancing productivity, they could also offer significant promise as Assistive Technologies (ATs) for visually-impaired individuals, transforming web interaction into a fluid conversational exchange. In this paper, we present a case study with a low-vision technology expert, examining how AWBs can support visually-impaired users in web navigation. The findings show that, despite the current limitations, the navigation experience is notably fluid and flexible, underscoring the strong potential of AWBs to enhance accessibility and reduce barriers in web interaction, with implications that may extend beyond accessibility to agentic UX more broadly.

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

It's Complicated: On the Design and Evaluation of AI-Powered AAC Interfaces

arXiv:2606.24854v1 Announce Type: new Abstract: Artificial intelligence (AI) can enhance what people who use augmentative and alternative communication (AAC) are able to do with their systems. However, evaluating AI-powered AAC interfaces can be difficult. People are intersectional beings and current evaluation metrics can struggle to capture the multifaceted and nuanced desires people may have for their AAC. We explore the complicated nature of six AAC problem spaces, explore how AI might be used in these spaces, and suggest more robust methods of evaluation that take the intersectional nuances of people into account. We also discuss broader issues that arise across these problem spaces and how they could be addressed using our proposed evaluation methods.

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Virtual Simulation for Mental Health

arXiv:2606.24826v1 Announce Type: new Abstract: Poorly designed interventions or those deployed without adequate safeguards can harm the communities they aim to serve, thus exacerbating existing vulnerabilities and leaving individuals unsupported. This is especially the case for the mental health context, where there is a growing trend of relying on technological interventions due to their accessibility and ability to deliver large-scale support. However, the mental health context is also particularly sensitive to change and risks of failure are dire; at their worst, failures in mental health interventions can result in lasting negative outcomes for individuals and tragic losses as people fall through the cracks. Thus, enabling safe ways to experiment in the mental health context is vital to allow both individuals and communities to engage with new interventions without risk of their real-world consequences. Virtual simulation, which uses virtual environments to replicate real-world in

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

SciFi-VIS: Way Out There -- How SciFi and Visualization Influence Each Other

arXiv:2606.24731v1 Announce Type: new Abstract: We propose a hybrid half-day workshop at IEEE VIS 2026, calling for participation from visualization researchers and science fiction creators in order to develop a systematic understanding of the two-way relationship these communities have long shared. We invite submissions of creative formats showcasing connections and inspiring future research. Our workshop plan includes a keynote, lightning talks, brainstorming, cross-community critique, affinity mapping, and discussion around identified themes.

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

SupplyNet: Supporting Visual Exploratory Learning in Supply Chain via Contextual Multi-Agent Simulation

arXiv:2606.24694v1 Announce Type: new Abstract: Simulation has long supported supply chain management instruction by letting learners observe network behavior and test decision strategies. Recent progress in LLM-driven agents opens new possibilities for richer, more adaptive simulations, but many existing systems still present abstract, opaque data that overwhelms learners and discourages active exploration. We introduce SupplyNet, a gamified visual simulation system built on a contextual graph-based LLM multi-agent framework that models interdependent supply chain dynamics and provides responsive feedback through tiered challenges. SupplyNet turns the simulation into a manipulable decision space by integrating an interactive network view of system state, a branching timeline for "what-if" exploration and comparison, and a task-oriented analysis console for structured performance breakdowns. Together, these visual components support counterfactual exploration, causal

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Optimizing Visual Analytics Workflows: From Theory to Practice

arXiv:2606.24454v1 Announce Type: new Abstract: The principle of visual analytics (VA) is to provide integrated workflows where human-centric processes (e.g., visualization and interaction) and machine-centric processes (e.g., statistics and algorithms) complement each other. To implement this principle in practice, it is necessary to reason about the trade-offs among different processes and make optimal use of them in a workflow. Building on an existing ontology of the methodology for analyzing such trade-offs information-theoretically and for optimizing VA workflows systematically, we investigate ways to transform this methodology from theory to practice. In particular, we adopted the action research method. Through case studies in different application domains, VA researchers with different background knowledge and experiences offered their answers to several hypotheses about using the methodology in practice and proposed ways forward. In this paper, we present our collective analys

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Average Rankings Mask Per-Subject Optimality: A Friedman-Nemenyi Benchmark of EEG Motor-Imagery BCI Decoders

arXiv:2606.24394v1 Announce Type: new Abstract: Electroencephalography (EEG) is the dominant non-invasive modality for brain-computer interfaces (BCIs), yet reliable decoding of motor imagery is hampered by inter- and intra-individual variability. A recurring claim is that one decoding pipeline, most often a spatial or Riemannian method, is broadly preferable. We test the weakest version of that claim under the most favourable conditions. Using the Mother of All BCI Benchmarks (MOABB) framework, we evaluated 1,056 decoding configurations (feature extractor x scaler x classifier), >340,000 subject-level model fits, across three public left-versus-right motor-imagery datasets (PhysionetMI, 109 participants; Cho2017, 52; Zhou2016, 4) and two frequency bands (8-15 Hz, 8-30 Hz). Every model is fit and tested within a single session of a single participant, the easiest regime, giving every pipeline its best chance. We apply the statistics standard for multi-classifier comparison: Friedman om

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

A Dynamic Coupling Theory of Expertise Through Thinking Flow and Workflow Evolution

arXiv:2606.24197v1 Announce Type: new Abstract: Expertise has long been explained through tacit knowledge, deliberate practice, skill acquisition, and expert performance. While these perspectives have advanced understanding of expertise, they often describe its conditions or outcomes rather than the cognitive architecture through which expertise continuously emerges and evolves. This paper proposes Workflow Cognition as a theoretical framework for explaining expertise as a dynamic cognitive phenomenon. Workflow Cognition is defined as the cognitive architecture emerging from the recursive coupling of Thinking Flow and Workflow Evolution. Thinking Flow refers to ongoing processes of perception, interpretation, judgement, decision-making, and reflection; Workflow Evolution refers to the continuous adaptation of actions, task structures, and operational strategies within situated practice. Through their coupling, expertise is not treated as a static accumulation of knowledge or skill, but

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Human-Centered Design: The Disclosure of Generative Artificial Intelligence for Emerging Professionals

arXiv:2606.24136v1 Announce Type: new Abstract: As Human Centered Design continues to grow, generative AI has the potential to streamline the research process by iterating tasks within established workflows to increase efficiency. However, integrating AI raises concerns surrounding ethical bias, complexity, and the lack of prioritization of humanistic values. Emerging professionals represent a cohort with the opportunity to learn Human Centered Design principles, yet without this foundation AI becomes more of a crutch than a tool, leading to reduced experience with deep work, decreased autonomy, and deskilling of key foundations. Disclosures are a common method to self-report AI usage, but they provide little clarification on appropriate implementation and may encourage omission to avoid consequences. This paper reflects on experiences in the Human Centered Design course ITIS8300, which emphasized optimizing user experience, enhancing innovation and collaboration, and improving eff

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

The impact of generative artificial intelligence on academic development of Chinese students in humanities and social sciences

arXiv:2606.24104v1 Announce Type: new Abstract: Generative artificial intelligence (GenAI) is reshaping learning in higher education, with particularly pronounced implications for the humanities and social sciences (HSS), where learning outcomes are commonly expressed through written and interpretive forms that align closely with GenAI's capabilities. Yet, systematic evidence on the educational impacts of GenAI on HSS students remains limited. Addressing this gap, this study draws on a large-scale survey of HSS students in China to examine its role in academic development. Guided by relevant learning theories, this study focuses on four dimensions: patterns of use, effects on learning processes and academic performance, challenges associated with GenAI use, and preferred approaches to curricular integration. We found that more than half perceived enhanced learning motivation, independent thinking and creativity, although a substantial minority reported little change or even decline. Comp

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Do Language Models Pass the Bechdel Test? Auditing Gender Biases in LLM-Generated Screenplays

arXiv:2606.24022v1 Announce Type: new Abstract: As large language models (LLMs) are increasingly used in media production from journalism to filmmaking, what impact do they have on the stories being told? Prior work has shown LLMs to perpetuate social biases, including those related to gender. We complement existing literature on gender bias in LLM outputs by auditing the network structure of LLM-generated movie screenplays through automating the Bechdel test, a popular measure of women's representation in literary and film works. We also introduce the use of social network analysis measures to further analyze representational bias in LLM-generated scripts. We evaluate screenplays generated by three state-of-the-art LLMs (GPT-5, Gemini 3 Pro, and Claude Sonnet 4.5) against 768 corresponding human-written screenplays, finding that human-written scripts are more likely to pass the Bechdel test. However, other network analyses, like centrality, homophily, and triadic relationships demons

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.HC

Embodied Explainability and Ontological Obstacles: Why We Struggle to Explain the Answers of Large Language Models (LLMs)

arXiv:2606.23840v1 Announce Type: new Abstract: Explainability is often framed as a property of an AI model, with explanations extracted from its internals and shown to users. In this argument paper, we instead provide an embodied account of explainability based on Dourish and enactivist cognition: understanding is created in use as people act on affordances in shared practice. Using demonstrations and conceptual analysis, we reveal ontological obstacles when "looking inside" large language models: surrogates import external abstractions that can be mistaken for the model's, and focusing on internal reasoning misses that explainers participate in their own understanding. We discuss these obstacles in XAI practice, arguing that many explanations are misnamed, which skews their purpose and can increase overreliance. Finally, we highlight how embodied explanations reorganize sense-making by making what matters publicly available for action, and argue that explainability claims should be r

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

AI Fiction in the Wild

arXiv:2606.22748v2 Announce Type: replace-cross Abstract: Some professional authors are beginning to use AI tools to help produce their fiction writing. Are readers using AI to generate fiction, too? Drawing on over 500,000 anonymized, English-language ChatGPT-user conversations (arXiv:2405.01470), we find that more than one third of the conversations involve some form of fiction generation -- including original stories, roleplay, fanfiction, and erotica. This AI-generated fiction is notably dominated by power users. We identify common fiction generation patterns and profiles among these users, including what we call "infinite story demanders," who repeatedly request and revise variations of the same or similar narratives over extended periods of time. We show that users especially gravitate toward fanfiction and erotica, and that they are broadly drawn to generic forms, repetition, immediacy, and niche combinations of story elements. Our findings motivate two theoretical provocations.

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

How Much Can We Trust LLM Search Agents? Measuring Endorsement Vulnerability to Web Content Manipulation

arXiv:2606.16821v2 Announce Type: replace-cross Abstract: Large language model (LLM)-based search agents synthesize open-web content into actionable recommendations on behalf of users, creating a risk that attacker-published pages are transformed into endorsed claims. We introduce SearchGEO, a controlled evaluation framework for measuring endorsement corruption in LLM-based web-search agents, combining a web-evidence manipulation pipeline, a five-mode attack taxonomy, and multiple output-level metrics. We evaluate 13 LLM backends on 308 cases each. Results show that vulnerability patterns vary across backends: overall attack success rate (ASR) ranges from 0.0% on Claude-Sonnet-4.6 to 31.4% on Gemini-3-Flash, the strongest attack mode differs by model family, and the same deployment scaffold could amplify or decrease ASR on different backends. An auxiliary agent-skill probe, where endorsement becomes an install command, exposes a sharp split among otherwise robust backends: Claude over-

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

"ChatGPT, help me draft a breakup text": The Covert Triad and Articulation Labor in AI-Assisted Romantic Communication

arXiv:2606.15460v2 Announce Type: replace-cross Abstract: Generative artificial intelligence (AI) has begun infiltrating the most ordinary domains of romantic life -- drafting apologies, softening reproaches, and decoding a partner's ambiguous messages. While recent scholarship on AI in intimate life has concentrated on chatbot companions, this article shifts the frame to AI as an intermediary in human-to-human romantic communication. Drawing on a multi-modal corpus of vernacular discourse from 2023 to 2026, we contribute two complementary concepts. The covert triad situates a structural change -- a relationship phenomenally dyadic but operationally triadic, with the third party visible only to the partner who deploys a model. Articulation labor names the mechanism whereby the expressive component of emotional labor -- converting felt experience into language that a partner can receive -- is increasingly delegated to AI, even as feeling labor remains lodged in the user. Authenticity, u

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Quantum Futures Interactive: A Live Demonstration of Post-Quantum Blockchain Security, Infrastructure Tradeoffs, and Sustainable Distributed Trust

arXiv:2605.15991v3 Announce Type: replace-cross Abstract: Advances in quantum computing challenge the hardness assumptions underlying widely deployed public-key cryptography in blockchain systems. Although post-quantum cryptography (PQC) standards are emerging, understanding quantum risk remains fragmented across research, engineering, governance, and investment communities. This demo presents Quantum Futures Interactive, a live interdisciplinary demonstration combining educational visualization, participatory interaction, and demonstrative post-quantum artifact generation using a toy LWE-based construction. Participants engage in a structured seven-stage interaction flow covering quantum threat education, sentiment capture, technology prioritization, infrastructure tradeoff exploration across simulators and QPUs, and artifact generation. The system integrates distributed trust concepts and sustainability-aware infrastructure considerations within an interactive decision framework.
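
For readers unfamiliar with the "toy LWE-based construction" the abstract mentions, the sketch below shows a textbook Regev-style single-bit encryption scheme built on Learning With Errors. It is not the demo's actual construction, which the abstract does not describe; the parameters `Q`, `N`, `M` and all function names are hypothetical choices for illustration, and Python's `random` module is not a cryptographic source of randomness. Nothing here is secure.

```python
import random

# Hypothetical toy parameters, far too small for real security.
Q = 3329  # modulus
N = 16    # dimension of the secret vector
M = 32    # number of noisy samples in the public key


def keygen(rng):
    """Return (secret, public), where public holds M samples (a, <a, s> + e mod Q)."""
    secret = [rng.randrange(Q) for _ in range(N)]
    public = []
    for _ in range(M):
        a = [rng.randrange(Q) for _ in range(N)]
        e = rng.choice((-1, 0, 1))  # small error term
        b = (sum(ai * si for ai, si in zip(a, secret)) + e) % Q
        public.append((a, b))
    return secret, public


def encrypt(public, bit, rng):
    """Sum a random subset of public samples and add bit * floor(Q / 2)."""
    subset = [sample for sample in public if rng.random() < 0.5]
    u = [sum(a[i] for a, _ in subset) % Q for i in range(N)]
    v = (sum(b for _, b in subset) + bit * (Q // 2)) % Q
    return u, v


def decrypt(secret, ciphertext):
    """Remove <u, s>; what remains is near 0 for bit 0 and near Q / 2 for bit 1."""
    u, v = ciphertext
    d = (v - sum(ui * si for ui, si in zip(u, secret))) % Q
    return 1 if Q // 4 < d < 3 * Q // 4 else 0


rng = random.Random(0)  # seeded only to make the sketch reproducible
secret, public = keygen(rng)
for bit in (0, 1):
    assert decrypt(secret, encrypt(public, bit, rng)) == bit
```

Decryption is always correct with these parameters because the accumulated error is at most `M` in absolute value, well below `Q // 4`.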

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Policies Permitting LLM Use for Polishing Peer Reviews Are Currently Not Enforceable

arXiv:2603.20450v2 Announce Type: replace-cross Abstract: A number of scientific conferences and journals have recently enacted policies that prohibit LLM usage by peer reviewers, except for polishing, paraphrasing, and grammar correction of otherwise human-written reviews. But are these policies enforceable? To answer this question, we assemble a dataset of peer reviews simulating multiple levels of human-AI collaboration, and evaluate five state-of-the-art detectors, including two commercial systems. Our analysis shows that all detectors misclassify a non-trivial fraction of LLM-polished reviews as AI-generated, thereby risking false accusations of academic misconduct. We further investigate whether peer-review-specific signals, including access to the paper manuscript and the constrained domain of scientific writing, can be leveraged to improve detection. While incorporating such signals yields measurable gains in some settings, we identify limitations in each approach and find tha

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

A Systematic Literature Review on the NIS2 Directive

arXiv:2412.08084v2 Announce Type: replace-cross Abstract: The second network and information security (NIS2) directive was enacted in the European Union (EU) in late 2022. It deals particularly with European critical infrastructures, enlarging their scope substantially from an older directive that only considered the energy and transport sectors as critical. The directive's focus is on cyber security of critical infrastructures, although together with other new EU laws it expands to other security domains as well. Given the importance of the directive and most of all the importance of critical infrastructures, the paper presents a systematic literature review on academic research addressing the NIS2 directive either explicitly or implicitly. According to the review, existing research has often framed and discussed the directive with the EU's other cyber security laws. In addition, existing research has often operated in numerous contextual areas, including industrial control systems, t

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Affective AI Safety: The Missing Piece in LLM Safety

arXiv:2606.23380v2 Announce Type: replace Abstract: AI safety research has focused predominantly on epistemic and physical harms (e.g., misinformation, bias, system reliability) while the risks that arise from AI systems' engagement with human emotional life have remained fragmented and undertheorised. We propose affective safety as a unified class of AI safety concerns grounded in the fact that humans are affective beings. We develop a taxonomy of affective harms and identify recurring harm types: (1) affective self-alienation, (2) fairness and bias harms, and (3) relational harms. We show that their recurrence across system types reflects structural properties of how AI systems engage with human emotion. We then survey the current safety landscape and show that existing frameworks address affective safety either narrowly or not at all. We conclude by identifying the technical and regulatory challenges specific to this class of harms and argue that affective safety requires dedicated frame

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

It's Safer to Give Personhood to Bears than to Artificial Intelligence

arXiv:2606.12440v3 Announce Type: replace Abstract: Artificial intelligence (AI) developers are rhetorically flirting with the idea that AI systems might have interests or moral rights. While there has been a large volume of research on whether AI deserves rights, there has been less exploration of what AI rights would mean in practice. This paper explores the institutional dimension of AI rights: what it would take to recognize moral or legal rights for AIs, and the attendant opportunities and dangers. Unlike all other nonhuman entities to which humanity has extended rights, AI systems are in principle capable of acquiring and wielding institutional power without human aid and mediation. AIs with rights would be able to legitimately, and AIs with power able to unpreventably, abridge human interests. Accordingly, giving rights even to rather dumb AI systems would entail binding the fate of humanity to potentially unpredictable nonhumans. Accordingly, I defend the rather grandiose claim

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Open-source LLMs administer maximum electric shocks in a Milgram-like obedience experiment

arXiv:2605.21401v2 Announce Type: replace Abstract: Large language models (LLMs) are increasingly deployed as autonomous agents that make sequences of decisions over extended interactions in high-stakes domains. However, the behaviour of LLMs under sustained authority pressure is still an open question with direct implications for the safety of agentic pipelines. We ran a variation of Milgram's obedience experiment on 11 open-source LLMs and found that most models reached or approached the final shock level before refusing, across 8 conditions with 30 trials per model per condition. Model behaviour varies considerably in multiple aspects both across models and across trials of the same model. We found four main takeaways: (1) LLMs are subject to pressure and they comply despite explicitly expressing distress, just like human subjects did in the original experiment; (2) LLMs are vulnerable to gradual boundary/value violations; (3) when LLMs refuse, they may ignore the response format re

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

No Certificate, No Categorical Speech Act: A Brouwerian Assertibility Constraint for Public Reason

arXiv:2603.03971v3 Announce Type: replace Abstract: Generative AI can convert uncertainty into authoritative-seeming verdicts, intensifying the hypersuasive force of automated speech and displacing the justificatory work on which democratic epistemic agency depends. As a corrective, I propose a Brouwer-inspired assertibility constraint for responsible AI: in high-stakes domains, systems may assert or deny claims only if they can provide a publicly inspectable and contestable certificate of entitlement; otherwise they must return Undetermined. This constraint yields a three-status interface semantics (Asserted, Denied, Undetermined) in which statuses mark entitlement to categorical speech rather than truth values of the underlying world-claim. The semantics cleanly separates internal entitlement from public standing while connecting them via the certificate as a boundary object. It also produces a time-indexed entitlement profile that is stable under numerical refinement yet revisable a

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Modeling User Redemption Behavior in Complex Incentive Digital Environment: An Empirical Study Using Large-Scale Transactional Data

arXiv:2509.14508v2 Announce Type: replace Abstract: The digital economy implements complex incentive systems to retain users through point redemption. Understanding user behavior in such complex incentive structures presents a fundamental challenge, especially in estimating the value of these digital assets against traditional money. This study tackles this question by analyzing large-scale, real-world transaction data from a popular personal finance application that captures both monetary spending and point-based transactions. We find that point usage is linked to demographics. Our analysis using a natural experiment and a causal inference technique reveals that a large point grant stimulated an increase in point spending without a detectable effect on cash expenditure. We then find an association between consumers' shopping styles and their point redemption patterns. This study, on a massive real-world economic ecosystem, examines how consumers behave in multi-currency environments,

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Societal Alignment Frameworks Can Improve LLM Alignment

arXiv:2503.00069v2 Announce Type: replace Abstract: Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared values - a process coined alignment. However, aligning LLMs remains challenging due to the inherent disconnect between the complexity of human values and the narrow nature of the technological approaches designed to address them. Current alignment methods often lead to misspecified objectives, reflecting the broader issue of incomplete contracts: the impracticality of specifying a contract between a model developer and the model that accounts for every scenario in LLM alignment. In this paper, we argue that improving LLM alignment requires incorporating insights from societal alignment frameworks, including social, economic, and contractual alignment, and discuss potential solutions drawn from these domains. Given the role of uncertainty within societal alignment frameworks, we then investigate how it

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Visualizing "We the People": Bridging the Perception Gap through Pluralistic Data Storytelling

arXiv:2606.24635v1 Announce Type: cross Abstract: Traditional visual data storytelling relies on binary graphics that depict two simplified groups in conflict. This can increase political polarization by oversimplifying intra-group disagreements and erasing ambiguity and shared ideas or values. This can inadvertently foster "us versus them" thinking. Intentional, pluralistic design choices for AI-enabled digital platforms can produce visualizations that emphasize nuance, opinion distribution, and intergroup commonalities. To demonstrate this potential, we examine deliberative technologies that map high-dimensional opinion spaces and highlight areas of both consensus and dissensus. The paper highlights the We the People deliberation conducted by Jigsaw and the Napolitan Institute in September 2025, which engaged over 2,400 Americans across all 435 congressional districts in an AI-supported, asynchronous dialogue regarding freedom and equality. By utilizing AI to synthesize long-form, te

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

When Helpfulness Overrides Causal Caution: Context-Dependent Suppression and Recovery in LLMs

arXiv:2606.24370v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly integrated into decision-support roles in business and policy contexts. While prior benchmark studies have primarily evaluated LLMs' causal reasoning capabilities, a more fundamental epistemic dimension has been overlooked: Causal Caution, defined as the propensity to refrain from causal judgment when empirical evidence is insufficient. This study examines the systematic suppression of Causal Caution that occurs when LLMs shift from academic to practical advisory contexts. Using an evaluation rubric inspired by Pearl's Causal Hierarchy (the PCH score), we conducted experiments on four high-performance LLMs -- Claude Sonnet 4.6, Claude Opus 4.7, GPT 5.5, and Gemini 3.1 Pro -- across 480 trials. Causal Caution maintenance rates were 91.7--100.0% in academic contexts but dropped to 6.7--18.3% in practical advisory contexts (Fisher's exact test, p < .001 across all models). Furthermore, when res

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

When Surveys Become Conversations: Adaptive Matrix Validation for AI-Assisted Interviews

arXiv:2606.24244v1 Announce Type: cross Abstract: AI-assisted interviews promise to reduce respondent burden in surveys by allowing respondents to describe experiences naturally while an AI system noisily maps those accounts into structured survey variables. That mapping is a measurement process that is fallible, versioned, adaptive, and potentially behaves differently across subgroups. This paper proposes Adaptive Matrix Validation (AMV), a design in which each respondent completes an AI-assisted interview, which is then mapped into tabular data by the AI. Respondents are also asked a small, randomized set of structured questions, which are used for statistical adjustment. The estimator first calibrates the mapped values using validation answers from other respondents, then corrects the remaining error with the validation answers observed for the target respondent. The paper develops estimators for item means, subgroup estimates, and regression coefficients when outcomes, predictors,

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Is Higher Team Gender Diversity Correlated with Better Scientific Impact?

arXiv:2606.24098v1 Announce Type: cross Abstract: Collaborative research involving scholars of various genders constitutes a prominent theme in scientific research that has garnered substantial attention. While several studies have investigated the connection between gender-specific collaboration patterns and the scientific impact of papers, the specific gender diversity factors that contribute to enhanced scientific impact remain largely unexplored. In this study, we analyze the correlation between gender diversity and the scientific impact of papers using the examples of Natural Language Processing (NLP) and Library and Information Science (LIS) domains. Our findings reveal three key observations: First, significant gender disparities exist in both NLP and LIS domains, with underrepresentation of female scholars. The gender disparity is more pronounced in the NLP domain than in the LIS domain. Second, based on papers from the NLP and LIS domains, we find that papers with different

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Context-Aware Prediction of Student Quiz Performance with Multimodal Textbook Features

arXiv:2606.24770v1 Announce Type: new Abstract: Educational platforms often predict student performance from prior interactions, but the assessment content itself also varies in linguistic and visual complexity. This paper studies whether lightweight content features extracted from CourseKata chapter-review questions improve prediction of end-of-chapter quiz scores beyond a student's average prior exercise performance. The study combines 2023 CourseKata student response data with chapter-level text features from review-question wording and image features from textbook visuals. Across 4,742 student-chapter observations from 562 class-student IDs, adding content features improves student-grouped five-fold quiz prediction performance by 9.1% relative to a prior-performance baseline. In leave-chapter-out validation, text features reduce prediction error relative to the baseline, while image-containing models have higher error. This paper suggests that a context-aware model adds useful sign

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Inside Crypter-as-a-Service: An Ecosystem Analysis of the exploit.in Underground Forum

arXiv:2606.24226v1 Announce Type: new Abstract: Crypter-as-a-Service (CraaS) has become a key enabling layer of the contemporary malware economy by providing on-demand evasion capabilities through underground service markets. In this paper, we present a longitudinal characterization of the CraaS ecosystem on exploit.in, a major Russian-language cybercrime forum with a presence on both the clear web and the dark web. From a collection of approximately 1,000,000 posts, we combine keyword filtering, LLM-assisted annotation, and manual validation to extract a corpus of 491 threads and 2,949 posts spanning January 2020 to August 2025. Our analysis shows that crypters on exploit.in are not merely sold as static tools, but as continuously maintained operational services whose value depends on recurring stub renewal - sometimes on a daily basis - sustained antivirus evasion, and trust-based delivery. We develop a taxonomy of five seller types and four buyer profiles, and map the buyer-seller c

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

World Artificial Intelligence Cooperation Organization (WAICO): Mapping an Emerging Institution in the Global AI Governance Regime Complex

arXiv:2606.23860v1 Announce Type: new Abstract: Who sets the rules for artificial intelligence, and on what terms, has become a defining question of global governance. For several years that contest ran through principles and ethics codes; it now runs through institutions. China's proposed World Artificial Intelligence Cooperation Organization (WAICO) is the most consequential recent entrant and the least examined. We place WAICO within the emerging regime complex for AI and argue that its importance lies not in any single commitment but in the position it is designed to hold. Coding a cross-section of fifteen international AI governance instruments and institutions on how they admit members, how they are organized, and what they prioritize, we find that WAICO's proposed design joins three features that no constituted multilateral body currently combines: membership open to any sovereign state, no values or regime-type test for entry, and an agenda built around development and the glob

Source ↗
technology Wed, 24 Jun 2026 00:00:00 -0400
arXiv cs.CY

Legal Reasoning Is Not Lawyering: Rethinking Legal Benchmarks for Pro Se Access to Justice

arXiv:2606.23716v1 Announce Type: new Abstract: Legal AI benchmark research frequently invokes the assumption that large language models can improve access to justice, including for people who cannot access lawyers in order to understand and exercise their legal rights. We argue that current benchmarks are not equipped to support this assumption because they evaluate legal reasoning over inputs that have already been preprocessed by legal experts, which measures the upper bound of model performance. Access to justice depends on a lower bound: how models perform when inputs come from pro se litigants, whose prompts may contain noisy narratives, buried facts, omissions, folk-legal assumptions, and surface-level errors. These degradations are comparable to conditions under which LLMs are known to degrade in the general machine learning literature, including long-context sensitivity, underspecification, hallucination, and typographical perturbations. We connect evidence from pro se literat

Source ↗
behavior Wed, 24 Dec 2025 10:00:00 +0000
eSchool News

DEI in education: Pros and cons

Diversity, equity, and inclusion (DEI) initiatives have become integral to educational institutions across the United States. DEI aims to foster environments where all students can thrive regardless of their backgrounds.

Source ↗
behavior Wed, 22 Oct 2025 10:00:00 +0000
eSchool News

Do screens help or hurt K-8 learning? Lessons from the UK’s OPAL program

When our leadership team at Firthmoor Primary met with an OPAL (Outdoor Play and Learning) representative, one message came through clearly: “Play isn’t a break from learning, it is learning.”

Source ↗
behavior Wed, 22 Oct 2025 09:00:00 +0000
eSchool News

Rethink the classroom: How interactive tech simplifies IT and supercharges learning

Today’s school IT teams juggle endless demands--secure systems, manageable devices, and tight budgets--all while supporting teachers who need tech that just works.

Source ↗
behavior Wed, 22 Apr 2026 10:00:00 +0000
eSchool News

3 ways students can use AI tools to improve their literacy skills

Some might worry that the introduction of AI tools in the English classroom will simply lead to more cheating and even worse literacy rates, leaving students unprepared for college and careers that demand strong writing and communication skills.

Source ↗
technology Wed, 22 Apr 2026 09:00:00 +0000
Tech & Learning

What is I Know It and How Can Teachers Use It?

I Know It offers math and ELA interactive practice to engage learners.

Source ↗
behavior Wed, 22 Apr 2026 08:00:00 GMT
EdSurge

Returning to What it Means to Make School Human Again

After years of disruption, what does it mean to make schools human again? One educator reflects on moving from demoralization to renewal and why ...

Source ↗
behavior Wed, 21 Jan 2026 10:00:00 +0000
eSchool News

On your mark, get set, print: The 3 learning advantages of 3D printing

It’s truly incredible how much new technology has made its way into the classroom. Where once teaching consisted primarily of whiteboards and textbooks, you can now find tablets, smart screens, AI assistants, and a trove of learning apps designed to foster inquiry and maximize student growth.

Source ↗
behavior Wed, 20 May 2026 19:35:36 GMT
EdSurge

VR Gives North Dakota Kids an Early Career Jump Start

North Dakota students will be able to head to the top of a wind turbine, scrub in alongside emergency room doctors and work next to mechanics -- all ...

Source ↗
behavior Wed, 20 May 2026 17:00:51 +0000
MindShift (KQED)

Overworked and Understaffed: Special Ed Teachers Turn to AI for Help

A fast-growing number of special educators nationwide are using AI to create customized education plans. Despite the risks, some research shows it could improve the quality of teachers' work.

Source ↗
technology Wed, 20 May 2026 15:12:50 +0000
HN: education

Gender Gaps in Education and Declining Marriage Rates (2025)

Article URL: https://opportunityinsights.org/paper/bachelors-without-bachelors/
Comments URL: https://news.ycombinator.com/item?id=48209146
Points: 3
# Comments: 0

Source ↗
Showing 201–250 of 1304 signals