Educational emergencies are key situations capable of redefining the way societies organize their educational systems. This research analyzes the influence of management teams on the development of teacher agency during emergency remote teaching in three schools from different socioeconomic and geographic contexts. Eighteen teachers and six administrators were interviewed, and the data were processed using the teacher agency framework, grounded theory, and thematic analysis. The results show that the decisions made by administrators during emergency remote teaching allowed for the development of teacher agency in three main dimensions: challenges of teacher adaptation, support from educational management, and improvement in teachers' perceptions of educational management. The study concludes that support from administrators and teachers' freedom in curricular decision-making are key factors in facilitating the development of teacher agency and addressing the educational crisis. The study is novel in its use of the ecological approach to teacher agency as a theoretical framework and in its retrospective analysis of social and educational crises.

Fuzzy Cognitive Maps (FCMs) are a type of recurrent neural network with built-in meaning in their architecture, originally devoted to modeling and scenario simulation tasks. These knowledge-based neural systems support feedback loops that handle static and temporal data. Over the last decade, there has been a noticeable increase in the number of contributions dedicated to developing FCM-based models and algorithms for structured pattern classification and time series forecasting. These models are attractive since they have proven competitive compared to black boxes while providing highly desirable interpretability features. Equally important are the theoretical studies that have significantly advanced our understanding of the convergence behavior and approximation capabilities of FCM-based models. These studies can challenge individuals who are not experts in Mathematics or Computer Science. As a result, we can occasionally find flawed FCM studies that fail to benefit from the theoretical progress experienced by the field. To address all these challenges, this survey paper aims to cover relevant theoretical and algorithmic advances in the field, while providing clear interpretations and practical pointers for both practitioners and researchers. Additionally, we will survey existing tools and software implementations, highlighting their strengths and limitations towards developing FCM-based solutions.
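
As a rough illustration of the recurrent reasoning that FCMs perform, the sketch below iterates one commonly used update rule, a_j(t+1) = f(sum_i w_ij a_i(t)), with a sigmoid transfer function until the activations stop changing. The weight matrix, the initial activations, and the function names are hypothetical and chosen only for illustration; this is a minimal sketch of one formulation, not the specific models covered by the survey.

```python
import math


def sigmoid(x, slope=1.0):
    """Sigmoid transfer function that keeps activations in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-slope * x))


def fcm_step(weights, state, slope=1.0):
    """One reasoning step: a_j(t+1) = f(sum_i weights[i][j] * a_i(t)).

    weights[i][j] is the (hypothetical) causal influence of concept i on concept j.
    """
    n = len(state)
    return [
        sigmoid(sum(weights[i][j] * state[i] for i in range(n)), slope)
        for j in range(n)
    ]


def fcm_simulate(weights, state, max_iters=100, tol=1e-6):
    """Iterate until activations change by less than tol or max_iters is reached."""
    for step in range(1, max_iters + 1):
        new_state = fcm_step(weights, state)
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state, step
        state = new_state
    return state, max_iters


# Hypothetical three-concept map, chosen only for illustration.
W = [
    [0.0, 0.6, -0.4],
    [0.0, 0.0, 0.7],
    [-0.5, 0.0, 0.0],
]
final_state, steps = fcm_simulate(W, [0.5, 0.2, 0.8])
print(steps, [round(a, 4) for a in final_state])
```

With small weights like these the update is a contraction, so the simulation settles on a fixed point; larger weights or steeper slopes may instead produce cycles or chaotic behavior, which is the kind of convergence question the theoretical studies address.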

Numerous datasets have been proposed to evaluate social bias in Natural Language Processing (NLP) systems. However, assessing bias within specific application domains remains challenging, as existing approaches often face limitations in scalability and fidelity across domains. In this work, we introduce a domain-adaptive framework that utilizes prompting with Large Language Models (LLMs) to automatically transform template-based bias datasets into domain-specific variants. We apply our method to two widely used benchmarks—Equity Evaluation Corpus (EEC) and Identity Phrase Templates Test Set (IPTTS)—adapting them to the Twitter and Wikipedia Talk data. Our results show that the adapted datasets yield bias estimates more closely aligned with real-world data. These findings highlight the potential of LLM-based prompting to enhance the realism and contextual relevance of bias evaluation in NLP systems.

Driver somnolence remains a major challenge for road safety, not only for its detection, but especially for forecasting when drowsiness will impair driving performance. To address this matter, various physiological signals and facial images are employed to identify signs of sleepiness. However, predicting a driver’s drowsiness a few minutes in advance is more complex than classifying their current state. This study introduces a novel forecasting method based on BiLSTM (Bidirectional Long Short-Term Memory) to predict when a driver will reach a predefined drowsiness threshold within a seven-minute window. A set of non-intrusive sensors, including force-sensing resistors (FSR) and vehicle measurements (telemetry data), alongside physiological data (EEG, ECG, EMG), is employed to detect and forecast upcoming drowsy events. Moreover, a combination of drowsiness detectors based on regression models and a ResNet architecture was implemented to evaluate the performance of these models. The multimodal database was collected from 30 volunteer drivers in a controlled virtual driving environment using a driving simulator in three different scenarios. The results make it possible to evaluate whether the performance of the BiLSTM model is enhanced when compared against non-intrusive sensor data. In comparison to existing classification-based approaches, the proposed BiLSTM forecasting model demonstrated superior predictive outcomes, reducing classification error rates and improving accuracy in forecasting drowsiness events. This improvement highlights the advantage of integrating regression-based detection with time-series forecasting, thereby enhancing the reliability of driver monitoring systems. Furthermore, the best regression model achieved a test accuracy of 0.964, while the best-performing forecasting model scored 0.86 on the same metric. Notably, the entirely non-intrusive FSR alternative achieves a promising detection accuracy of 0.905. These findings demonstrate the feasibility of using time-series data, non-intrusive sensors, and a forecasting technique to predict upcoming drowsiness events, enabling a practical alternative for continuously monitoring the drowsiness status of drivers.

Pre-service teachers can play a crucial role in integrating AI-based tools into the new educational landscape. However, there is a need to validate specialized instruments, apply current conceptualizations such as Intelligent-TPACK, and address ethical issues, as pre-service teachers are often overlooked in the development of tools for AI integration. To address these gaps, we adapted an existing instrument designed for in-service teachers to measure pre-service teachers’ integration of AI within their training context. We conducted a quantitative cross-sectional survey of 366 pre-service teachers to evaluate the adapted Intelligent-TPACK instrument and examine participants' demographic characteristics in relation to the framework dimensions. Data analysis included a Confirmatory Factor Analysis to assess the factor model of the adapted instrument, followed by correlations to compare participant variables such as gender, type of university, and stage in the training program with the Intelligent-TPACK model factors. To investigate differences among groups, the nonparametric ANCOVA test (Quade test) was used, enabling control of covariates such as age and academic progress level to ensure comparability across the dimensions of the Intelligent-TPACK model. Findings reveal a high level of fit of the Intelligent-TPACK model for pre-service teachers (CFI=0.997; TLI=0.997). The data also show statistically significant effects related to academic progress level and type of institution, while gender, geographic location, and type of major did not show noteworthy differences. These results highlight key areas for future curriculum development and support for pre-service teachers in integrating AI education.

Medical vision-language models can automate the generation of radiology reports but struggle with accurate visual grounding and factual consistency. Existing models often misalign textual findings with visual evidence, leading to unreliable or weakly grounded predictions. We present CURE, an error-aware curriculum learning framework that improves grounding and report quality without any additional data. CURE fine-tunes a multimodal instructional model on phrase grounding, grounded report generation, and anatomy-grounded report generation using public datasets. The method dynamically adjusts sampling based on model performance, emphasizing harder samples to improve spatial and textual alignment. CURE improves grounding accuracy by +0.37 IoU, boosts report quality by +0.188 CXRFEScore, and reduces hallucinations by 18.6%. CURE is a data-efficient framework that enhances both grounding accuracy and report reliability. Code and model weights are publicly available.
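
To make the grounding metric and the idea of performance-based sampling concrete, the sketch below computes intersection over union (IoU) for two axis-aligned boxes and turns per-sample IoU scores into sampling weights that favor harder samples. The function names, the box format (x_min, y_min, x_max, y_max), and the `floor` parameter are assumptions made for illustration; the weighting scheme is a generic stand-in and should not be read as CURE's actual sampling rule.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def error_aware_weights(ious, floor=0.05):
    """Hypothetical weighting: sampling probability proportional to 1 - IoU.

    The floor keeps well-grounded samples from dropping out of training entirely.
    """
    errors = [max(1.0 - iou, floor) for iou in ious]
    total = sum(errors)
    return [e / total for e in errors]


# Hypothetical predicted and reference boxes for one phrase.
predicted = (0.0, 0.0, 2.0, 2.0)
reference = (1.0, 1.0, 3.0, 3.0)
print(box_iou(predicted, reference))

# Samples with lower IoU (harder cases) receive larger sampling weights.
print(error_aware_weights([0.9, 0.5, 0.1]))
```

The resulting weights could be passed to a weighted sampler and recomputed periodically as the model improves, which is one simple way to realize a curriculum that tracks model performance.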

We analyze the long-term behavior of hyperbolic neural networks through subhomogeneous layer maps, focusing on stability, growth control, and robustness under stochastic perturbations. This work unifies the standard hyperbolic models via explicit isometries and Möbius operations, allowing statements to be transported across representations without loss of geometric meaning. Within this model-invariant view, we study iterated, noise-perturbed transformations and develop an ergodic-theoretic framework that characterizes their asymptotic behavior, including conditions that promote stability and convergence of averaged iterates. Beyond theory, these insights inform practical design choices for training procedures that remain well-behaved in the presence of noise and avoid unbounded parameter growth, thereby supporting more reliable use of hyperbolic representations in hierarchical and graph-structured learning tasks.
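
As a small illustration of what transporting statements across representations can mean, the sketch below implements the standard isometry between the Poincaré ball and the Lorentz (hyperboloid) model at curvature -1, together with Möbius addition on the ball. The function names are hypothetical, the curvature is fixed at -1 for simplicity, and the code is a sketch of textbook formulas rather than the constructions used in the paper.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def mobius_add(x, y):
    """Mobius addition on the unit Poincare ball (curvature -1)."""
    xy, x2, y2 = dot(x, y), dot(x, x), dot(y, y)
    denom = 1.0 + 2.0 * xy + x2 * y2
    cx = 1.0 + 2.0 * xy + y2
    cy = 1.0 - x2
    return [(cx * a + cy * b) / denom for a, b in zip(x, y)]


def poincare_to_lorentz(x):
    """Map a ball point to the hyperboloid {p : -p0^2 + |p'|^2 = -1, p0 > 0}."""
    x2 = dot(x, x)
    return [(1.0 + x2) / (1.0 - x2)] + [2.0 * a / (1.0 - x2) for a in x]


def lorentz_to_poincare(p):
    """Inverse map from the hyperboloid back to the ball."""
    return [a / (1.0 + p[0]) for a in p[1:]]


def poincare_distance(x, y):
    diff2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.acosh(1.0 + 2.0 * diff2 / ((1.0 - dot(x, x)) * (1.0 - dot(y, y))))


def lorentz_distance(p, q):
    inner = -p[0] * q[0] + dot(p[1:], q[1:])  # Lorentzian inner product
    return math.acosh(max(1.0, -inner))  # clamp guards against rounding below 1


# Hypothetical points inside the unit ball.
x, y = [0.3, 0.1], [-0.2, 0.4]
print(poincare_distance(x, y))
print(lorentz_distance(poincare_to_lorentz(x), poincare_to_lorentz(y)))
```

The two printed distances agree up to rounding, which is the sense in which a distance-based statement proved in one model carries over to the other.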

Large language models (LLMs) such as GPT-4o have the potential to transform clinical decision-making, patient education, and medical research. Despite impressive performance in generating patient-friendly educational materials and assisting in clinical documentation, concerns remain regarding the reliability, subtle errors, and biases that can undermine their use in high-stakes medical settings. A multi-phase experimental design was employed to assess the performance of GPT-4o on the Chilean anesthesiology exam (CONACEM), which comprised 183 questions covering four cognitive domains (Understanding, Recall, Application, and Analysis) based on Bloom’s taxonomy. Thirty independent simulation runs were conducted with systematic variation of the model’s temperature parameter to gauge the balance between deterministic and creative responses. The generated responses underwent qualitative error analysis using a refined taxonomy that categorized errors such as “Unsupported Medical Claim,” “Hallucination of Information,” “Sticking with Wrong Diagnosis,” “Non-medical Factual Error,” “Incorrect Understanding of Task,” “Reasonable Response,” “Ignore Missing Information,” and “Incorrect or Vague Conclusion.” Two board-certified anesthesiologists performed independent annotations, with disagreements resolved by a third expert. Statistical evaluations, including one-way ANOVA, non-parametric tests, chi-square, and linear mixed-effects modeling, were used to compare performance across domains and analyze error frequency. GPT-4o achieved an overall accuracy of 83.69%. Performance varied significantly by cognitive domain, with the highest accuracy observed in the Understanding (90.10%) and Recall (84.38%) domains, and lower accuracy in Application (76.83%) and Analysis (76.54%). Among the 120 incorrect responses, unsupported medical claims were the most common error (40.69%), followed by vague or incorrect conclusions (22.07%). Co-occurrence analyses revealed that unsupported claims often appeared alongside imprecise conclusions, highlighting a trend of compounded errors, particularly in tasks requiring complex reasoning. Inter-rater reliability for error annotation was robust, with a mean Cohen’s kappa of 0.73. While GPT-4o exhibits strengths in factual recall and comprehension, its limitations in handling higher-order reasoning and diagnostic judgment are evident through frequent unsupported medical claims and vague conclusions. These findings underscore the need for improved domain-specific fine-tuning, enhanced error mitigation strategies, and integrated knowledge verification mechanisms prior to clinical deployment.
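
For readers unfamiliar with the reliability statistic reported above, the sketch below computes Cohen's kappa for two annotators as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from each annotator's label frequencies. The function name and the example annotations are hypothetical and are not the study's data; returning 1.0 when chance agreement is exactly 1 is a convention assumed here to avoid division by zero.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # assumed convention: both annotators used one identical label
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical annotations of six incorrect responses (not the study's data).
annotator_1 = ["Unsupported Medical Claim", "Hallucination of Information",
               "Unsupported Medical Claim", "Incorrect or Vague Conclusion",
               "Unsupported Medical Claim", "Incorrect or Vague Conclusion"]
annotator_2 = ["Unsupported Medical Claim", "Hallucination of Information",
               "Incorrect or Vague Conclusion", "Incorrect or Vague Conclusion",
               "Unsupported Medical Claim", "Incorrect or Vague Conclusion"]
print(round(cohens_kappa(annotator_1, annotator_2), 3))
```

A mean kappa, as reported in the abstract, would then be an average of such values over several annotation subsets or error categories; how the study grouped them is not stated here.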

Recent investigations have shown that the tympanic membranes exhibit synchronous oscillations with each saccadic eye movement (Gruters et al., 2018), a phenomenon known as eye movement-related eardrum oscillations (EMREOs). However, the dependence of these saccade-associated EMREOs on ongoing visual activity remains to be elucidated. Given the direct projections from motor areas to primary auditory and visual cortices and the observation that EMREOs’ onset occurs concurrently with, or even precedes, saccades, we hypothesized that EMREOs would persist in the absence of visual stimulation. This report presents a study wherein 16 healthy male and female participants executed horizontal saccades under three distinct conditions: (1) in a well-lit environment, (2) in a darkened environment with eyes open, and (3) in a darkened environment with eyes closed. Ocular movements were quantified via electrooculography, and tympanic membrane oscillations were registered using in-ear microphones. The results demonstrated the presence of EMREOs concurrent with both visually guided and memory-guided saccades, although a late minor reduction in amplitude was observed in the “dark with open eyes” condition. Significant attenuation of EMREOs was evident when participants performed saccades with their eyelids closed, despite maintaining the same saccade amplitude and initial velocity. This amplitude reduction may reflect modulations in cortical states associated with predictive coding.

In today’s digital economy, where personalization has become a cornerstone of effective marketing strategies, companies face the dual challenge of increasing advertising impact while safeguarding sensitive customer information. Despite the rapid progress of large language models (LLMs), existing commercial solutions often neglect the integration of synthetic data to reduce privacy risks and enhance adaptability, leaving organizations dependent on external providers. To address this gap, our work fine-tunes open-source LLMs (Llama 2, Mistral, and Zephyr) with synthetic datasets generated via GPT, aiming to produce customized marketing emails tailored to demographic and behavioral features. This thesis demonstrates not only the feasibility but also the competitiveness of such models by evaluating outputs with standard metrics (BLEU, ROUGE) and human-like scoring through GPT-4, showing that open-source models can approximate the performance of proprietary alternatives at significantly lower cost. The results confirm that LLMs fine-tuned with synthetic data represent a viable solution for enterprises seeking efficiency, personalization, and internal control of data.
