We study basic human decision-making and learning processes in repeated and/or sequential choices. Understanding the basic processes in these very common settings (e.g. driving, behavior in pandemics, using smartphone apps, health decisions) improves our ability both to predict behavior and to design mechanisms and policies that are robust to the likely behaviors of systems’ users.
We integrate psychological theories and models of human decision making into machine learning systems to predict human decision making at state-of-the-art levels. Focusing on the most fundamental choice task from behavioral economics and using the largest datasets currently available, we study which theories and models, which types of machine learning algorithms and tools, and which methods of integration lead to the best out-of-sample predictions.
Vaccinations are considered the major tool to curb the current SARS-CoV-2 pandemic. A randomized placebo-controlled trial of the BNT162b2 vaccine has demonstrated a 95% efficacy in preventing COVID-19 disease. These results are now corroborated with statistical analyses of real-world vaccination rollouts, but resolving vaccine effectiveness across demographic groups is challenging. Here, applying a multivariable logistic regression analysis approach to a large patient-level dataset, including SARS-CoV-2 tests, vaccine inoculations and personalized demographics, we model vaccine effectiveness at daily resolution and its interaction with sex, age and comorbidities. Vaccine effectiveness gradually increased post day 12 of inoculation, then plateaued, around 35 days, reaching 91.2% [CI 88.8%-93.1%] for all infections and 99.3% [CI 95.3%-99.9%] for symptomatic infections. Effectiveness was uniform for men and women yet declined mildly but significantly with age and for patients with specific chronic comorbidities, most notably type 2 diabetes. Quantifying real-world vaccine effectiveness, including both biological and behavioral effects, our analysis provides initial measurement of vaccine effectiveness across demographic groups.
Antibiotic resistance is prevalent among the bacterial pathogens causing urinary tract infections. However, antimicrobial treatment is often prescribed ‘empirically’, in the absence of antibiotic susceptibility testing, risking mismatched and therefore ineffective treatment. Here, linking a 10-year longitudinal data set of over 700,000 community-acquired urinary tract infections with over 5,000,000 individually resolved records of antibiotic purchases, we identify strong associations of antibiotic resistance with the demographics, records of past urine cultures and history of drug purchases of the patients. When combined together, these associations allow for machine-learning-based personalized drug-specific predictions of antibiotic resistance, thereby enabling drug-prescribing algorithms that match an antibiotic treatment recommendation to the expected resistance of each sample. Applying these algorithms retrospectively, over a 1-year test period, we find that they greatly reduce the risk of mismatched treatment compared with the current standard of care. The clinical application of such algorithms may help improve the effectiveness of antimicrobial treatments.
Mass vaccination has the potential to curb the current COVID-19 pandemic by protecting individuals who have been vaccinated against the disease and possibly lowering the likelihood of transmission to individuals who have not been vaccinated. The high effectiveness of the widely administered BNT162b2 vaccine from Pfizer–BioNTech in preventing not only the disease but also infection with SARS-CoV-2 suggests a potential for a population-level effect, which is critical for disease eradication. However, this putative effect is difficult to observe, especially in light of highly fluctuating spatiotemporal epidemic dynamics. Here, by analyzing vaccination records and test results collected during the rapid vaccine rollout in a large population from 177 geographically defined communities, we find that the rates of vaccination in each community are associated with a substantial later decline in infections among a cohort of individuals aged under 16 years, who are unvaccinated. On average, for each 20 percentage points of individuals who are vaccinated in a given population, the positive test fraction for the unvaccinated population decreased approximately twofold. These results provide observational evidence that vaccination not only protects individuals who have been vaccinated but also provides cross-protection to unvaccinated individuals in the community.
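The reported dose-response relationship can be read as a simple exponential rule of thumb. The sketch below assumes that the approximately twofold decline per 20 percentage points compounds multiplicatively across the observed range; that extrapolation, the function name and its parameters are illustrative assumptions, not part of the study.

```python
def relative_positive_fraction(vaccinated_pp: float, halving_pp: float = 20.0) -> float:
    """Relative positive-test fraction among the unvaccinated.

    Assumes the fraction halves for every `halving_pp` percentage points
    of the community that is vaccinated (an illustrative extrapolation).
    """
    return 0.5 ** (vaccinated_pp / halving_pp)


# Compare communities at 0, 20 and 40 percentage points of vaccination.
for pp in (0, 20, 40):
    print(pp, relative_positive_fraction(pp))
```

Under this assumption, a community with 40 percentage points more vaccinated individuals would be expected to show roughly a quarter of the positive test fraction among its unvaccinated members.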
Understanding how to deal with model uncertainty is key for building resilient agents that can overcome environments that are unforeseen. My research group has studied for years different approaches that build robust agents that can cope with different types of uncertainties. Robustness means that policies are immune to changes in the environment leading to better real time performance. In a sequence of papers we developed robust reinforcement learning and planning algorithms including scaling up such algorithms, learning the uncertainty set online, adapting quickly to unknown uncertainties, and online adaptation. The main application areas here are energy and transport services.
We consider a reinforcement learning scheme for selecting how and what to transfer in 5G networks. The problem at hand is to decide which bit-rate to use and which channels would yield the best tradeoff in terms of power, performance, and cost. We employ multi-objective, multi-agent reinforcement learning to best decide how to transmit the data. In previous work, we proposed to use multi-armed bandit algorithms that ignore the current channel and agent state (see O. Avner and S. Mannor, Multi-User Communication Networks: A Coordinated Multi-Armed Bandit Approach, IEEE/ACM Transactions on Networking, Volume 27, Issue 6, Dec. 2019, https://ieeexplore.ieee.org/document/8875003), but in this project we go further and consider the state of the transmission, the real time requirements, and the changing channel.
We consider the potential role of language as a regularizer in reinforcement learning. The objective is to create hierarchical reinforcement learning algorithms that are explainable by design: they use language to describe what they do. The language models can be learned, dictated, imitated, or created. In a paper that appeared in ICML 2019, we introduced Act2Vec, a general framework for learning context-based action representation for Reinforcement Learning. Representing actions in a vector space helps reinforcement learning algorithms achieve better performance by grouping similar actions and utilizing relations between different actions. We showed how prior knowledge of an environment can be extracted from demonstrations and injected into action vector representations that encode natural compatible behavior. We then used these for augmenting state representations as well as improving function approximation of Q-values. We visualize and test action embeddings in three domains including a drawing task, a high dimensional navigation task, and the large action space domain of StarCraft II.
Caching is one of the most effective performance-boosting techniques: hot data items are stored in a memory that is closer to the application and faster than the main storage. In software-managed caches, the cache is typically the local DRAM, as opposed to SSDs, HDDs, or remote storage. The W-TinyLFU scheme for maintaining software caches now dominates the Java and Go ecosystems. It is applied, either directly or through the Caffeine and Ristretto caching libraries, in Cassandra, Accumulo, HBase, Apache Solr, Infinispan, Open-Whisk, Corfu, Finagle, Spring, Akka, Neo4j, DGraph, Druid, and many others. We continue to expand it to new domains.
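To give a feel for frequency-based admission, here is a minimal sketch of a TinyLFU-style admission filter in front of an LRU cache. It is a simplification under stated assumptions, not the Caffeine or Ristretto implementation: the class names, the sketch dimensions and the plain LRU main region are all illustrative, and the small admission window that W-TinyLFU places in front of the filter is omitted.

```python
import hashlib
from collections import OrderedDict


class FrequencySketch:
    """Small count-min sketch with periodic halving so old popularity fades."""

    def __init__(self, width=1024, depth=4, sample_size=10_000):
        self.width, self.depth, self.sample_size = width, depth, sample_size
        self.table = [[0] * width for _ in range(depth)]
        self.additions = 0

    def _cells(self, key):
        # One independent-looking hash per row, derived from a keyed digest.
        for row in range(self.depth):
            digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def record(self, key):
        for row, col in self._cells(key):
            self.table[row][col] += 1
        self.additions += 1
        if self.additions >= self.sample_size:  # aging step
            self.table = [[v // 2 for v in row] for row in self.table]
            self.additions //= 2

    def estimate(self, key):
        return min(self.table[row][col] for row, col in self._cells(key))


class TinyLfuCache:
    """LRU cache that admits a new key only if it is more popular than the victim."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.sketch = FrequencySketch()
        self.data = OrderedDict()  # least recently used entry first

    def get(self, key):
        self.sketch.record(key)  # every lookup counts towards popularity
        if key in self.data:
            self.data.move_to_end(key)
            return self.data[key]
        return None

    def put(self, key, value):
        """Insert the entry; return False if the admission filter rejects it."""
        if key in self.data or len(self.data) < self.capacity:
            self.data[key] = value
            self.data.move_to_end(key)
            return True
        victim = next(iter(self.data))
        if self.sketch.estimate(key) > self.sketch.estimate(victim):
            del self.data[victim]
            self.data[key] = value
            return True
        return False
```

The design choice illustrated here is that a key seen only once cannot displace an entry that has been requested at least as often, which protects the cache from scans and one-hit wonders.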
Learning a new skill requires assimilating into our brain the regularities of the external world and how our body interacts with them as we engage in this skill. Mechanistically, this entails a translation of inputs, rules, and outputs into changes to the structure of neural networks in our brain. How this translation occurs is still largely unknown. We will follow the process of this assimilation using Trained Recurrent Neural Networks (TRNNs), which are increasingly used as models of neural circuits of trained animals.
Cancer cells embedded in healthy tissue can revert to normal cells, and vice versa for healthy tissue in a tumor environment. This highlights two parallel learning processes, at the cell level and at the tissue level, in the development or suppression of disease. Cancer cells use their intrinsic dynamic plasticity to escape and explore novel. Simultaneously, tissue homeostasis is a target of the collective of cells forming the tissue, which suppresses this exploration and keeps the cell type stable. We use the language of machine learning to characterize these two learning processes.
Training machine learning algorithms often introduces the phenomenon of underspecification: a wide gap between the dataset used for training and the real task. A parallel phenomenon in neuroscience is the variety of strategies with which animals can approach a given task. These observations imply that for every task and training set there exists a space of solutions that are equivalent on that set. Neither the structure of this space nor the rules of motion within it are understood. In this work, we study the space of solutions that emerges from those degrees of freedom in Recurrent Neural Networks (RNNs) trained on neuroscience-inspired tasks.
Bilevel optimization problems arise in many ML and signal processing applications, where the aim is to find the minimal norm or most sparse optimal solution of an underdetermined optimization problem. Traditionally, these problems have been solved by regularization, which requires tuning of the regularization parameter. We focus on an alternative approach which utilizes first order optimization methods to directly solve this problem, for which we provide rate of convergence guarantees.
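For concreteness, the problem class can be written as follows. The notation is an assumption made for illustration: $f$ denotes the inner (underdetermined) objective, $\omega$ the outer selection criterion, and $\lambda$ the regularization parameter.

```latex
% Bilevel formulation: among all minimizers of f, pick the one that minimizes \omega
\min_{x \in \mathbb{R}^n} \ \omega(x)
\quad \text{s.t.} \quad
x \in X^{*} := \operatorname*{argmin}_{z \in \mathbb{R}^n} f(z),
\qquad
\omega(x) = \tfrac{1}{2}\|x\|_2^2 \ \text{(minimal norm)}
\quad \text{or} \quad
\omega(x) = \|x\|_1 \ \text{(sparsity)}.

% Traditional regularized surrogate, which requires tuning \lambda > 0
\min_{x \in \mathbb{R}^n} \ f(x) + \lambda\, \omega(x).
```

The surrogate recovers a solution of the bilevel problem only in the limit $\lambda \to 0$ in general, which is why its parameter needs tuning, whereas a direct method works with the first formulation.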
Multi-stage linear stochastic optimization problems are known to be challenging. An added difficulty arises when the distribution of the uncertainty is not known exactly, and alternatively only historical sample paths of the problem are available. We explore solving this problem by using data-driven distributionally robust optimization, for which we provide convergence guarantees. Additionally, we explore solving the resulting optimization problem by approximation methods.
Modern recommendation platforms have become complex, dynamic ecosystems. Platforms often rely on machine learning models to successfully match users to content, but most methods neglect to account for how they affect user behavior, satisfaction, and well-being over time. Here we propose a novel dynamical-systems perspective on recommendation that allows us to reason about, and control, macro-temporal aspects of recommendation policies as they relate to user behavior.
The task of optimizing machines to support human decision-making is often conflated with that of optimizing machines for accuracy, even though they are materially different. Whereas typical learning systems prescribe actions through prediction, our framework learns to reframe problems in a way that directly supports human decisions. Using a novel human-in-the-loop training procedure, our framework learns problem representations that directly optimize human performance.
Machine learning has become imperative for informing decisions that affect the lives of humans across a multitude of domains. But when people benefit from certain predictive outcomes, they are prone to act strategically to improve those outcomes. Our goal in this project is to develop a practical learning framework that accounts for how humans behaviourally respond to classification rules. Our framework provides robustness while also providing means to promote favourable social outcomes.
When using machine learning algorithms, it is often assumed that the data is complete. In real-life applications, however, this assumption is usually over-optimistic. “Missingness” can happen in many ways: some missing covariates, some missing responses, only a lower bound is given for the response (i.e., the response is right censored), observations are seen only if they crossed some level (i.e., left truncation), or a label is given only to a bag of observations. We develop machine learning tools that can handle missing data, using imputation, inverse probability weighting, and doubly-robust estimators.
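As a small illustration of inverse probability weighting, the sketch below estimates a population mean when some responses are missing. It assumes the probability of observing each response (the propensity) is known or has already been estimated; the function name and the toy numbers are hypothetical.

```python
def ipw_mean(responses, observed, propensities):
    """Normalized (Hajek) inverse-probability-weighted mean of the response.

    responses[i]    -- response value, ignored when observed[i] is False
    observed[i]     -- True if the response of unit i was recorded
    propensities[i] -- probability that unit i's response is observed
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for y, seen, p in zip(responses, observed, propensities):
        if seen:
            weighted_sum += y / p
            weight_total += 1.0 / p
    return weighted_sum / weight_total


# Toy data: group A (y = 10) is observed half the time, group B (y = 20) always.
responses = [10, None, 10, None, 20, 20, 20, 20]
observed = [True, False, True, False, True, True, True, True]
propensities = [0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0]

complete_case = sum(y for y, s in zip(responses, observed) if s) / sum(observed)
weighted = ipw_mean(responses, observed, propensities)
```

In this toy example the complete-case average is pulled towards the fully observed group, while the weighted estimate recovers the mean of the full population.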
Data scientists are interested in answering questions such as how confident one is in a prediction, and whether a certain feature has a significant influence on the response variable. Drawing statistical inference for machine learning algorithms is difficult. We study methods for performing statistical inference for two common machine learning techniques: kernel machines and deep learning. We utilize Bayesian methods to quantify uncertainty, select hyper-parameter values, and to bound the generalization error. We propose novel PAC-Bayes generalization bounds which can be data-dependent.
To help policymakers set policy based on scientific methods, we use mathematical modeling and advanced statistical tools to study different aspects of the COVID-19 pandemic. Our research includes learning the susceptibility and infectivity of children and adolescents; the protection of vaccination and previous SARS-CoV-2 infection in preventing subsequent SARS-CoV-2 infection and other COVID-19 outcomes; and the effect of COVID-19 on different aspects of public health, such as suicide rate and natural abortion.
This project will enable unreliable edge computing nodes to jointly provide a reliable storage service for unpredictable user workloads. Edge systems consist of small-scale servers (nodes) at the edge of a network whose root is in a cloud-based datacenter. Their premise is to bring data and computing closer to time-critical applications running on, e.g., cellphones and autonomous vehicles. We combine storage redundancy schemes with scalable algorithms for object mapping and request scheduling.
Modern stochastic optimization methods often rely on uniform sampling, which is agnostic to the underlying characteristics of the data. This might degrade the convergence by yielding estimates that suffer from a high variance. A possible remedy is to employ non-uniform importance sampling techniques, which take the structure of the dataset into account. In this work, we investigate a recently proposed setting which poses variance reduction as an online optimization problem with bandit feedback. We devise a novel and efficient algorithm for this setting that finds a sequence of importance sampling distributions competitive with the best fixed distribution in hindsight, the first result of this kind. While we present our method for sampling data points, it naturally extends to selecting coordinates or even blocks thereof. Empirical validations underline the benefits of our method in several settings.
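The effect of the sampling distribution on variance can be seen in a toy calculation. The sketch below is not the bandit algorithm described above; it only computes, for scalar per-example gradients, the exact mean and variance of the standard importance-weighted estimator g_i / (n p_i). The function name and the example values are hypothetical.

```python
def estimator_moments(grads, probs):
    """Exact mean and variance of g_i / (n * p_i) when index i is drawn from probs."""
    n = len(grads)
    mean = sum(p * (g / (n * p)) for g, p in zip(grads, probs))
    second_moment = sum(p * (g / (n * p)) ** 2 for g, p in zip(grads, probs))
    return mean, second_moment - mean ** 2


grads = [1.0, 1.0, 1.0, 9.0]  # one example dominates the gradient

uniform = [1.0 / len(grads)] * len(grads)
proportional = [abs(g) / sum(abs(g) for g in grads) for g in grads]

uniform_mean, uniform_var = estimator_moments(grads, uniform)
prop_mean, prop_var = estimator_moments(grads, proportional)
```

Both distributions give an unbiased estimate of the average gradient, but sampling proportionally to the gradient magnitude removes the variance entirely in this scalar case. In practice those magnitudes are unknown in advance, which is what motivates learning the distribution online.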
We are building theoretical and practical models that take as input both a mechanistic world model (for example, an ordinary differential equation describing the cardiovascular system) and data (for example, ICU patient vital signs). The goal is to get the best of both worlds: the robustness, interpretability, and causal grounding of mechanistic models, together with the flexibility of black-box deep learning models.
Big data sources have been used extensively to analyze people’s travel patterns. This project breaks new ground by using big data on travel patterns to identify the incidence and severity of travel problems, defined here as any difficulty a person may experience in reaching desired destinations. Relying on a large-scale app-based mobility survey, data will be extracted on individuals’ trip rates, travel horizons, trip speeds, and more, with the aim of detecting individuals particularly likely to experience severe travel problems.
Immunotherapy has revolutionized cancer therapy, leading to the 2018 Nobel Prize in Physiology or Medicine. However, despite the dramatic response observed in several cancer types, many patients do not benefit from this treatment or relapse in a relatively short time. To improve our understanding of patient response, we utilize single-cell RNA-seq data to characterize the tumor’s microenvironment, identify biomarkers of response and predict novel drug targets.
The use of immunotherapy for solid tumors has expanded dramatically with the development of checkpoint blockade therapy. Despite the unprecedented responses observed in different tumor types, many patients are refractory to therapy or acquire resistance. Growing evidence shows that the metabolic requirements of immune cells in the tumor microenvironment greatly influence the success of therapy. Here we use genomic and metabolic modeling analysis to reveal the metabolic dependencies between tumor and immune cells and identify perturbations that can increase immune activity.
Pancreatic cancer is the most aggressive form of human malignancies, with only a 6% 5-year survival rate. Recently, it was found that a subgroup of patients carry mutations in the homologous recombination (HR) genes BRCA1 or BRCA2 and that these tumors are sensitive to PARP inhibitors. However, responses are infrequent and the subset of patients suitable for the treatment is limited. Here we use genomic data to computationally identify molecular signatures of response to be used as biomarkers, and aim to increase the number of patients that can benefit from the treatment.
In this project we developed and validated a new sentiment analysis engine for conversational data, called CustSent, in collaboration with LivePerson Inc.
We then developed the novel concept of emotional load – the load that employees must bear due to the emotional strain inherent in the service interactions in which they engage. Using contact center and healthcare data we investigate the impact of Emotional Load on agents and the progression of the service interaction.
We investigate how the transparency of the medical process and wait time information influence ED patients. In collaboration with Clalit Health Services, we developed a web-based app that delivers information to ED patients through their mobile phones. The development combines methods of process mining, queueing theory, and human-centered UX design. The system operates at Carmel Medical Center. Our research examines the impact of information transparency on ED efficiency and patient behavior.
Contact centers (CS) are considered the future of service delivery, offering service via texting, social media, and apps. These provide companies with unique opportunities, such as providing service proactively only to the customers that need it the most, but are also prone to new operational challenges, such as concurrency management and information uncertainty. CS data allow us to investigate the dynamics of service production and the behaviors of customers and agents. In a series of projects, we create new service models for CS and control policies for those systems.
The aim of the project is to develop a new methodology for deciphering the human factor in illuminance-related building operation by taking advantage of recent developments in commercial building automation systems and the increasing prevalence of digital control systems for shading operation. The project involves the analysis of a large-scale dataset of long-term roller blinds operation in a multi-story office building in Tel Aviv, reflecting user preferences on indoor lighting conditions.
The aim of the project is to address an existing gap in the evaluation and modelling of urban microclimates, their effects on human thermal stress and perception, and the application of scientific data in urban planning processes. This is achieved through the creation of a single computational data collection and analysis platform that integrates biophysical comfort indices and urban-scale physical, climatic, and pedestrian mapping.
(led by Prof. David Pearlmutter, Ben Gurion University of the Negev)
The aim of the project is to develop a new methodology for evaluating microclimatic summer conditions across an entire city, focusing on the provision of outdoor shade as a primary comfort indicator. Based on high-resolution 2.5D mapping of buildings, ground, and tree canopies, we employ detailed calculation of solar exposure at street level and propose the use of a summer Shade Index as a quantifiable factor for revealing a city’s hierarchy of microclimatic qualities.
We study the problem of computing embeddings of tuples of a relational database in a manner that is extensible to dynamic changes of the database. Importantly, the embedding of existing tuples should not change due to the embedding of newly inserted tuples (as database applications might rely on existing embeddings), while the embedding of all tuples, old and new, should retain high quality. Our preliminary solutions show promising results relative to the alternatives, consistently and often considerably.
How should we quantify the amount of inconsistency in the database?
Proper inconsistency measures are important for various tasks, such as progress indication and action prioritization in data cleaning, and reliability estimation for datasets. We investigate a collection of basic measures in both the Knowledge Representation and Database communities, analyze their theoretical properties, and empirically observe their behavior in an experimental study. We demonstrate how the framework can lead to new inconsistency measures by introducing a new measure that satisfies all of the properties we consider and can be computed efficiently.
We are interested in developing intelligent systems that support students’ learning. One project develops “invention activities” for students learning data science, supported by automatic feedback mechanisms. This approach aims to facilitate improved understanding of data science concepts by letting students invent and test quantitative measures. In a second project, we are developing an intelligent system for supporting student collaboration on joint projects. We are designing algorithms for analyzing students’ and design interfaces that will provide collaborators with actionable information regarding the group’s progress.
Understanding the capabilities and limitations of agents is important for users, as they need to choose between different agents, adjust the level of autonomy of an agent, or work alongside an agent. While prior work in explainable AI has developed methods for explaining individual decisions of an agent to a person retrospectively, these approaches do not provide users with a global understanding of an agent’s expected behavior in a range of situations. We are developing explanation methods for reinforcement learning agents.
The precision agriculture (PA) concept is based on observing, measuring and responding to inter- and intra-field variability in crops or livestock. The goal is to facilitate a decision support system (DSS) for whole-farm management that optimizes returns on inputs while preserving resources. Among the many approaches, we focus on three specific applications: precise irrigation, early crop disease detection and early detection of pain in dairy cows.
This research consists of the development and validation of effective, reliable and applicable algorithms for early detection (ED) of contamination in drinking water (DW) from one or more sources, using data from water quality (WQ) sensors. Specifically, anomaly detection in UV-absorbance spectra as a means of contamination detection is presented. An additional ED algorithm has also been developed, utilizing WQ measurements of standard physicochemical parameters. The algorithm’s high performance, together with its simplicity, adjustability, ease of implementation and low computational complexity, makes it a valuable addition to water monitoring systems. Testing the performance of the two ED algorithms showed that processing physicochemical WQ measurements to detect anomalies can serve as an effective early detection system (EDS) for DW contamination.
Recent developments in sensory and communication technologies have made low-cost micro-sensing units (MSUs) feasible. These MSUs can operate as a set of individual nodes, or may be interconnected to form a Wireless Distributed Environmental Sensor Network (WDESN). The MSUs’ low power consumption and small size enable many new applications, such as mobile sensing. Their main limitation is their relatively low accuracy compared with laboratory equipment or an AQM station. In this project we examine algorithms for assessing these sensors in field operations, as well as autonomous calibration and error concealment, optimal placement of the sensors, and the utilization of mobile sensors in the process. Together with advanced algorithms for data analysis, these provide a comprehensive toolset for atmospheric data analysis.
The overarching goal of this research is to investigate recommendation tasks from a probabilistic perspective. We aim to confront data uncertainty directly as part of the recommendation process and to propose new probabilistic ranking techniques for various recommendation tasks. We look for new semantics, evaluation measures and efficient processing methods suitable for various recommendation tasks, towards designing a general framework for generating high-quality recommendations.
Goal recognition design is a problem, in which we take a domain theory and a set of goals and ask:
1) to what extent do the actions performed by an agent within the model reveal its objective, and 2) what is the best way to modify a model so that any agent acting in the model reveals its objective early on. As a first stage, Goal Recognition Design finds the Worst Case Distinctiveness (wcd) of a model and as a second stage, after finding the wcd of a model, we aim at minimizing it.
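The wcd notion can be illustrated on a toy model. The sketch below assumes agents follow optimal plans and that those plans can be listed explicitly for every goal, which is only realistic for very small models; under these assumptions it takes the wcd to be the longest action prefix shared by optimal plans to two different goals. The function names and the example plans are hypothetical.

```python
from itertools import combinations


def common_prefix_length(plan_a, plan_b):
    """Number of leading actions on which two plans agree."""
    length = 0
    for action_a, action_b in zip(plan_a, plan_b):
        if action_a != action_b:
            break
        length += 1
    return length


def worst_case_distinctiveness(optimal_plans):
    """Longest prefix an agent can follow without revealing which goal it pursues.

    optimal_plans maps each goal to a list of its optimal plans (action tuples).
    """
    wcd = 0
    for (_, plans_a), (_, plans_b) in combinations(optimal_plans.items(), 2):
        for plan_a in plans_a:
            for plan_b in plans_b:
                wcd = max(wcd, common_prefix_length(plan_a, plan_b))
    return wcd


# Original model: both goals share a two-step corridor before the paths split.
original = {
    "goal_1": [("up", "up", "left")],
    "goal_2": [("up", "up", "right")],
}
# Redesigned model: the corridor is blocked, so the paths diverge immediately.
redesigned = {
    "goal_1": [("left", "up", "up")],
    "goal_2": [("right", "up", "up")],
}

wcd_before = worst_case_distinctiveness(original)
wcd_after = worst_case_distinctiveness(redesigned)
```

Here the redesign leaves every plan at the same length while reducing the wcd from 2 to 0, so an observer can identify the goal from the first action.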
It is known that the hippocampus contains place cells, which are responsible for coding the position of the animal in the environment. We record data from hundreds of cells simultaneously using calcium imaging in freely foraging mice, which gives us an opportunity to analyze the network properties and dynamics of hippocampal place cells during foraging and other behavioral tasks.
This project focuses on methods for solving bi-level optimization problems that smartly exploit the special structure of the constraint set (as the solution set of another optimization problem) and involve explicit operations. Among the several theoretical results we have provided in recent papers, the convergence rate result for the sequence of function values is special since it is the first of its kind. This area of research calls for new algorithms for tackling various bi-level problems.
In this project, we address structured deep learning optimization problems, which are given by the sum of non-convex and non-smooth functions. As an example, we study a particular structure in which the non-smoothness is represented as the maximum of non-convex smooth functions. Recently, for this maximum structure, we developed the Stochastic Proximal Linear Method (SPLM), which is guaranteed to reach a critical point of this learning objective, and analyzed its convergence rate.
A randomized controlled trial of 20 intervention clinics and 20 usual-care control clinics to establish the value (better health? better use of resources?) of implementing precision medicine tools in primary clinical practice. The intervention includes DNA testing on platforms of different levels (from NGS panels to GWAS, WES and WGS), microbiome testing, and the use of wearable devices/sensors. The adult population of the study clinics includes some 140,000 people, and if enough resources are obtained, the study is expected to reach some 100,000 participants. Current resources allowed us to break ground in one clinic, where 1,660 people have already signed a consent form. The study is National IRB approved.
GWAS-based study of >10,000 Israelis of various ethnicities serving among other purposes to establish an ethnic-specific (Jews/Arabs, Ashkenazi/Sephardi) atlas of frequencies of pharmacogenetic variants. Identify new associations between medication use in this cohort and identified SNPs. GWAS was carried out using the Illumina 500K Onco SNP array. Study is National IRB approved. Funded by MOST.
More than 40,000 participants in case-control studies of breast/colorectal/lung/gynecological/pancreato-hepato-biliary cancers. For each participant we have long entry questionnaire (800 questions: health habits, health status, family history, more…), blood sample (DNA), tumor tissue sample (for many), EMR of follow-up. Every cancer case has a matched control without cancer. All studies are National IRB approved. Partially funded by various agencies, BCRF, ICRF…
We investigate analytic and numerical solutions of nonlinear gradient flows. We examine the flows as nonlinear PDE’s and use tools from nonlinear spectral theory. We have recently revealed relations between Dynamic mode decomposition (DMD), a common tool for fluid dynamics, and nonlinear eigenfunctions related to homogeneous flows. We are investigating through this lens gradient descent algorithms of complex systems.
Statistically reasoning about complex systems involves a probability distribution over exponentially many configurations. For example, semantic labeling of an image requires inferring a discrete label for each image pixel, resulting in a number of possible segmentations that is exponential in the number of pixels. Standard approaches such as Gibbs sampling are slow in practice and cannot be applied to many real-life problems. Our goal is to integrate optimization and sampling through extreme value statistics and to define a new statistical framework for which sampling and parameter estimation in complex systems are efficient. This framework is based on measuring the stability of prediction to random changes in the potential interactions.
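A classical small-scale instance of sampling through optimization is the Gumbel-max trick: perturbing each configuration's potential with independent Gumbel noise and taking the maximizer yields an exact sample from the corresponding Gibbs distribution. The sketch below enumerates the configurations explicitly, so it illustrates the principle rather than the efficient framework described above; the function name and the example potentials are assumptions.

```python
import math
import random


def gumbel_max_sample(potentials, rng):
    """Return argmax_i (potentials[i] + G_i), with G_i i.i.d. standard Gumbel noise.

    The returned index is distributed proportionally to exp(potentials[i]).
    """
    best_index, best_value = 0, -math.inf
    for index, theta in enumerate(potentials):
        u = max(rng.random(), 1e-300)  # guard against log(0)
        gumbel = -math.log(-math.log(u))
        if theta + gumbel > best_value:
            best_index, best_value = index, theta + gumbel
    return best_index


rng = random.Random(0)
potentials = [0.0, math.log(3.0)]  # Gibbs probabilities 1/4 and 3/4
draws = [gumbel_max_sample(potentials, rng) for _ in range(20_000)]
frequency_of_1 = sum(draws) / len(draws)
```

With these potentials the empirical frequency of configuration 1 should be close to 3/4, which shows that a single maximization over perturbed potentials replaces an iterative sampler.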
Deep learning has revolutionized AI, and machine learning techniques can be used to achieve human-like behavior. To better address complex tasks such as visual dialog or visual navigation, we designed a general attention mechanism based on factor graphs, which can combine the high-dimensional information that governs complex tasks. This framework allowed us to win the visual dialog challenge of CVPR 2020.
Cellular channels are increasingly used for sensitive real-time applications. For example, real-time video can now be broadcast over parallel cellular channels, possibly from a moving vehicle. Such channels are characterized by high variability, and require improved flow control algorithms to maintain stable flow. This work addresses the application of deep learning algorithms to develop suitable flow control and scheduling algorithms under real-time delay constraints.
The need to create quantum states of light, such as entangled photons, arises from their importance in the fields of quantum information and quantum optics. In recent years, quantum cluster states were used in quantum computation, entangled photons were used to demonstrate quantum teleportation, and quantum hyper-dense coding protocols enable breaking the classical limit for information transfer. All these applications require efficient methods for generation of quantum light.
Our project develops new approaches for creating many-photon quantum light, by using recent advances in quantum electrodynamics and quantum optics. These advances are especially promising for creating deterministic, heralded, entangled photon sources.
Linear recursions with integer coefficients, such as the recursion of the Fibonacci sequence, have been intensely studied over millennia, yet still hide interesting undiscovered mathematics. Such a recursion was used by Apéry in his proof of the irrationality of certain values of the Riemann zeta function. Similar recursions can prove the irrationality of other fundamental constants such as π and e. However, it is not generally known under what conditions a linear recursion can be used to prove irrationality.
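As a concrete instance, Apéry's recursion for ζ(3) can be run with exact rational arithmetic. The sketch below uses the standard form (n+1)³ u_{n+1} = (34n³ + 51n² + 27n + 5) u_n − n³ u_{n−1}, with initial values a₀ = 1, a₁ = 5 and b₀ = 0, b₁ = 6; the ratios b_n / a_n converge rapidly to ζ(3). The function name is hypothetical.

```python
from fractions import Fraction


def apery_sequences(n_terms):
    """Return [(a_0, b_0), ..., (a_{n_terms-1}, b_{n_terms-1})] for Apery's recursion."""
    a_prev, a_cur = Fraction(1), Fraction(5)
    b_prev, b_cur = Fraction(0), Fraction(6)
    terms = [(a_prev, b_prev), (a_cur, b_cur)]
    for n in range(1, n_terms - 1):
        coeff = 34 * n**3 + 51 * n**2 + 27 * n + 5
        # Both sequences satisfy the same recursion, with different initial values.
        a_next = (coeff * a_cur - n**3 * a_prev) / (n + 1) ** 3
        b_next = (coeff * b_cur - n**3 * b_prev) / (n + 1) ** 3
        a_prev, a_cur = a_cur, a_next
        b_prev, b_cur = b_cur, b_next
        terms.append((a_cur, b_cur))
    return terms[:n_terms]


terms = apery_sequences(7)
approximations = [float(b / a) for a, b in terms]
```

The a_n are integers (1, 5, 73, 1445, ...), while the b_n are rationals whose denominators grow slowly enough, relative to the speed of convergence, for the irrationality argument to go through.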
Our project develops new hypotheses and proofs for linear recursions. Specifically, we generalize Apéry’s work, finding the conditions for which similar recursions can be used to prove irrationality.
Looking forward, we would like to search for a wider theory on sequences created by any linear recursion with integer coefficients. Such results can help develop systematic algorithms for finding formulas for fundamental constants and contribute to ongoing efforts to answer open questions, such as proving the irrationality of values of the Riemann zeta function (e.g., ζ(5)).
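As a concrete illustration of the kind of recursion discussed above, the following minimal sketch iterates Apéry's three-term recursion for ζ(3) using exact rational arithmetic. The recursion and its initial values are the standard ones from Apéry's proof; the function name `apery_convergents` is hypothetical and is not taken from the project's code.

```python
from fractions import Fraction

def apery_convergents(n_terms):
    """Return lists (a, b) generated by Apery's recursion for zeta(3).

    Both sequences satisfy
        (n + 1)^3 u_{n+1} = (34n^3 + 51n^2 + 27n + 5) u_n - n^3 u_{n-1},
    with a_0 = 1, a_1 = 5 and b_0 = 0, b_1 = 6.
    The ratio b_n / a_n converges to zeta(3).
    """
    a = [Fraction(1), Fraction(5)]
    b = [Fraction(0), Fraction(6)]
    for n in range(1, n_terms - 1):
        p = 34 * n**3 + 51 * n**2 + 27 * n + 5
        for seq in (a, b):
            # Exact rational update; no floating-point error accumulates.
            seq.append((p * seq[n] - n**3 * seq[n - 1]) / (n + 1) ** 3)
    return a, b

a, b = apery_convergents(8)
for n in range(1, 8):
    print(n, float(b[n] / a[n]))
```

The printed ratios approach ζ(3) ≈ 1.2020569 very quickly. The irrationality proof rests on this fast convergence together with control over the denominators of b_n, which the sketch does not attempt to show.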
Fundamental mathematical constants like e and π are ubiquitous in diverse fields of science, from abstract mathematics to physics and biology. For centuries, new formulas relating fundamental constants have been scarce and usually discovered sporadically.
Our project develops systematic approaches that leverage algorithms for deriving formulas for fundamental constants and help reveal their underlying structure.
This research reverses the conventional approach of sequential logic in formal proofs. Instead, our algorithms utilize numerical data to unveil mathematical structures, trying to play the role of intuition of great mathematicians of the past to find leads for future research.
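The idea of using numerical data to find leads can be sketched as follows: evaluate a candidate formula to high depth and compare it numerically with a known constant. The example below checks a continued-fraction expression for e of the kind such searches target. The helper `eval_gcf` and the chosen depth are assumptions made for illustration, not the project's actual algorithm, and a numerical match is evidence for a conjecture rather than a proof.

```python
import math

def eval_gcf(a, b, depth):
    """Evaluate a(0) + b(1)/(a(1) + b(2)/(a(2) + ...)) truncated at `depth`."""
    value = a(depth)
    # Fold the fraction from the innermost term outwards.
    for n in range(depth, 0, -1):
        value = a(n - 1) + b(n) / value
    return value

# Candidate formula: e = 3 - 1/(4 - 2/(5 - 3/(6 - ...)))
candidate = eval_gcf(lambda n: n + 3, lambda n: -n, depth=30)
print(candidate, math.e, abs(candidate - math.e))
```

A search procedure would repeat this comparison over many candidate pairs `a`, `b` and keep only those that agree with the target constant to many digits.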
We introduced an unconditional generative model that can be learned from a single natural image. Our model, coined SinGAN, is trained to capture the internal distribution of patches within the image, and is then able to generate high quality, diverse samples of arbitrary size and aspect ratio that carry the same visual content as the image. We illustrated the utility of SinGAN in a wide range of image manipulation tasks. This work won the Best Paper Award (Marr Prize) at ICCV'19.
Improvements in training speed are needed to develop the next generation of deep learning models. To complete such a massive amount of computation in a reasonable time, training is parallelized across multiple GPU cores. Perhaps the most popular parallelization method is to use a large batch of data in each iteration of SGD, so the gradient computation can be performed in parallel on multiple workers. We aim to enable massive parallelization without the performance degradation that is commonly observed.
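The large-batch scheme described above can be sketched in a few lines: the batch is split into shards, each worker computes a gradient on its shard, and the results are averaged. This is a minimal single-process simulation on a one-parameter least-squares model; the function names and toy data are assumptions, and equal shard sizes are assumed so that the average of shard gradients equals the full-batch gradient.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error of y ~ w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def data_parallel_grad(w, batch, n_workers):
    """Split the batch into equal shards, compute one gradient per
    (simulated) worker, then average the results."""
    shard = len(batch) // n_workers
    shards = [batch[i * shard:(i + 1) * shard] for i in range(n_workers)]
    worker_grads = [grad_mse(w, s) for s in shards]  # would run in parallel
    return sum(worker_grads) / n_workers

batch = [(float(x), 3.0 * x + 1.0) for x in range(8)]
w = 0.5
full = grad_mse(w, batch)
parallel = data_parallel_grad(w, batch, n_workers=4)
print(full, parallel)
```

The two printed values agree, which is why larger batches map naturally onto more workers. The sketch says nothing about the generalization effects of large batches, which is the open issue the paragraph refers to.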
We aim to improve the resource efficiency of deep learning (e.g., energy, bandwidth) for training and inference. Our focus is on decreasing the numerical precision of neural network models, a simple and effective way to improve their resource efficiency. Nearly all recent deep learning related hardware relies heavily on lower precision math. The benefits are a reduction in the memory required to store the neural network, a reduction in chip area, and a drastic improvement in energy efficiency.
Significant research efforts are being invested in improving Deep Neural Networks (DNNs) via various modifications. However, such modifications often cause an unexplained degradation in the generalization performance of DNNs on unseen data. Recent findings suggest that this degradation is caused by changes to the hidden algorithmic bias of the training algorithm and model. This bias determines which solution is selected from all solutions that fit the data. We aim to understand and control this algorithmic bias.
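To make the idea of reduced numerical precision concrete, here is a minimal sketch of symmetric uniform 8-bit quantization of a weight vector. It is a generic textbook scheme chosen for illustration, not necessarily the method used in this project; the function names and example weights are assumptions.

```python
def quantize_int8(weights):
    """Symmetric uniform quantization of floats to integers in [-127, 127]."""
    # One shared scale maps the largest magnitude to 127 (1.0 if all zero).
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the integers back to approximate floating-point weights."""
    return [v * scale for v in q]

weights = [0.82, -0.31, 0.05, -1.27, 0.64]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_err)
```

Each weight is stored as one byte plus a shared scale instead of a 32-bit float, and the reconstruction error is bounded by half the scale.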
This project seeks to elucidate the mechanisms of information storage and processing in machine learning systems of human language, by (a) measuring localization and distributivity of information in complex models; (b) discovering causal relationships between model components and automatic (potentially biased) decisions; and (c) making language processing systems more interpretable and controllable. The research is expected to promote responsible and accountable adoption of language technology.
Despite the empirical success of deep learning models in natural language processing (NLP), these models face two challenges: they are opaque and difficult to interpret; and they are fragile and not robust to shifts in the data distribution. This project studies the relationship between interpretability and robustness in NLP: are more robust models also more interpretable, and vice versa? This research is expected to facilitate the development of models that are more trustworthy, fair, and reliable.
Information recorded by service systems (e.g., in the telecommunication, finance, and health sectors) during their operation provides an angle for operational process analysis, commonly referred to as process mining. Here we establish a queueing perspective in process mining to address the online delay prediction problem, that is, predicting how long the execution of an activity for a running instance of a service process is delayed due to queueing effects. We develop predictors for waiting times from event logs recorded by an information system during process execution. Based on large datasets from the telecommunications and financial sectors, our evaluation demonstrates accurate online predictions, which drastically improve over predictors that neglect the queueing perspective.
Service systems are often stochastic and preplanned by appointments, yet implementations of their appointment systems are prevalently deterministic. We address this gap between plan and reality by developing data-driven methods for appointment scheduling and sequencing. The results are tractable and scalable solutions that accommodate hundreds of jobs and servers. To test practical performance, we leverage a unique data set from a cancer center that combines real-time locations, electronic health records, and appointment logs. Focusing on one of the center’s infusion units, we consistently reduce cost (waiting plus overtime) on the order of 15%–40%.
Motivated by applications in machine learning and archival storage, we introduce function-correcting codes (FCCs), a new class of codes to protect a function evaluation of the data against errors. We show that FCCs are equivalent to irregular-distance codes, i.e., codes that obey some given distance requirement between each pair of codewords. Using these connections, we study these codes and derive general upper and lower bounds on their optimal redundancy. Since these bounds depend on the specific function, we provide simplified, suboptimal bounds that are easier to evaluate.
Private information retrieval (PIR) protocols make it possible to retrieve a file from a database without disclosing any information about the identity of the file being retrieved. While existing protocols strictly impose that no information is leaked on the file’s identity, this project initiates the study of the tradeoffs that can be achieved by relaxing the requirement of perfect privacy. We propose to study this problem when the database is either replicated or is stored distributively over several servers, and when it is simply stored by a single server.
In the trace reconstruction problem, a length-n string x yields a collection of noisy traces, each obtained independently from x by passing it through a deletion channel, which deletes every symbol with some fixed probability. The main goal under this paradigm is to determine the minimum number of i.i.d. traces required to reconstruct x with high probability. The focus of this work is to extend this problem to the model where each trace is the result of x passing through a deletion-insertion-substitution channel.
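The basic deletion-channel model can be simulated with a short sketch such as the one below. It covers only deletions, not the insertion and substitution errors of the extended model, and the function names, the binary example string, and the fixed seed are assumptions for illustration.

```python
import random

def deletion_channel(x, p, rng):
    """Delete each symbol of x independently with probability p."""
    return "".join(c for c in x if rng.random() >= p)

def sample_traces(x, p, n_traces, seed=0):
    """Draw n_traces independent traces of x from the deletion channel."""
    rng = random.Random(seed)
    return [deletion_channel(x, p, rng) for _ in range(n_traces)]

def is_subsequence(t, x):
    """Check that t can be obtained from x by deleting symbols."""
    it = iter(x)
    return all(c in it for c in t)

x = "1011001110"
traces = sample_traces(x, p=0.2, n_traces=5)
print(traces)
```

Every trace is a subsequence of x. A reconstruction algorithm receives only the list of traces and must recover x from them.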
We study the manifold of diffusion operators, on which we can define geometric, differential, and probabilistic structures. This research direction entails a fresh approach to multi-manifold learning, departing from the traditional use of spectral decomposition of diffusion operators for embedding.
Diffusion operators are positive (semi-)definite and have a particular Riemannian geometry. While each diffusion operator extracts the manifold of a single data set, transportation of diffusion operators on the associated Riemannian manifold enables us to merge and compare multiple data sets.
One of the long-standing challenges in signal processing and data analysis is the fusion of information acquired by multiple, multimodal sensors.
Of particular interest in the context of our research are the massive data sets of medical recordings and healthcare-related information, acquired routinely in operation rooms, intensive care units, and clinics. Such distinct and complementary information calls for the development of new theories and methods, leveraging it toward achieving concrete objectives such as analysis, filtering, and prediction, in a broad range of fields.
Integer Programming is a fundamental framework for discrete optimization with generic modeling power and numerous applications. We are developing an algebraic theory that makes it possible to solve integer programming problems with large numbers of variables over sparse systems.
In particular, we have recently shown that integer programming is fixed-parameter tractable when parameterized by the numeric measure and the sparsity measure of the system at hand.
Consider a group of workers who answered questions that have correct yet unknown answers. The workers are heterogeneous: they could be ordinary people, trained volunteers, a panel of experts, different computer algorithms, or a mix of all the above. Our approach is based on empirical Bayes methods, and the aim is to construct an algorithm that aggregates all workers’ answers into a single output that is close to the unknown truth. (MSc student: Tsviel Ben-Shabat, co-advisor: Reshef Meir)
Intensive care medicine is complex, resource intensive and expensive. It is a dynamic and highly technical field of medicine, taking care of the sickest patients. Decisions need to be made rapidly based on the evolving clinical state of the patient which can fluctuate over seconds and minutes. We develop ML models to tackle major predictive challenges for critically-ill patients. Specifically, models that predict an upcoming possible adverse event to provide the clinical team time to intervene and thus improve outcome and save lives, and models that predict the future course and treatment response in a patient-specific manner.
Pulse oximetry is routinely used for monitoring a patient’s oxygen saturation level non-invasively. A low oxygen level in the blood means low oxygen in the tissues, and ultimately this can lead to organ failure. The development of digital oximetry biomarkers (OBMs) engineered from the oxygen saturation time series can support diagnosis, characterize subgroups of patients with various disease severity (phenotyping) and enable continuous monitoring of a patient’s pulmonary function to predict eventual deteriorations (prognosis). We create new OBMs and ML models for the diagnosis of respiratory conditions such as obstructive sleep apnea, chronic obstructive pulmonary disease and pneumonia.
Major cardiovascular and cerebrovascular events occur in individuals without known pre-existing cardiovascular conditions. Preventing such events remains a serious public health challenge. For that purpose, clinical risk scores can be used to identify individuals with high cardiovascular risks. However, available scoring scales have shown moderate performance. Despite being part of the routine evaluation of many patients in both primary and specialized care, the role of electrocardiogram (ECG) analysis in cardiovascular disease prediction and, hence, prevention is not as clear. We research digital biomarkers and deep representation learning approaches to cardiovascular diseases risk prediction using the ECG.
DNA information is rapidly growing in importance and in volume. Most data compressors for DNA have extreme encoding complexities, which are prohibitive for low-cost and portable sequencers. We develop a compression scheme with minimal encoding complexity, taking advantage of the availability of DNA references and computation resources in the cloud.
When a distributed storage system is used by decentralized applications (for example: blockchains), accessing individual shards of large data units, new features are needed that are not offered by existing distributed storage systems. In particular, coding the data with standard erasure codes does not allow adequate access performance. We develop erasure codes specifically addressing efficient recovery and access in decentralized applications.
The common use of AI today is that data is provided to some central computing facility (in the cloud), where the learning tasks (training and inference) are performed. The main issues with this practice are high communication cost and compromised data privacy. Moving part of the learning tasks to the edges mitigates these issues. The key question is how to aggregate multiple unreliable outputs from the edge to one reliable learning output, where unreliability is manifested in: missing inputs (stragglers), wrong inputs, and malicious inputs.
Data deduplication is one of the most effective ways to reduce data size in large-scale systems. In a nutshell, duplicate copies of data chunks in different files are replaced with pointers to a single copy of each unique chunk. Optimized deduplication mechanisms facilitated its adoption to online primary storage, introducing new complexities to which traditional solutions do not directly apply. Our objective is to optimize capacity planning, management and load balancing in such systems.
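The pointer-based mechanism described above can be sketched as follows. This minimal example uses fixed-size chunks and SHA-256 fingerprints; the chunk size, file names, and function names are assumptions for illustration, and production systems typically use much larger and often content-defined chunks.

```python
import hashlib

def deduplicate(files, chunk_size=4):
    """Store each unique fixed-size chunk once; files become lists of
    pointers (chunk hashes) into the shared chunk store."""
    store = {}    # hash -> chunk bytes
    recipes = {}  # file name -> ordered list of chunk hashes
    for name, data in files.items():
        recipe = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            key = hashlib.sha256(chunk).hexdigest()
            store.setdefault(key, chunk)  # keep only the first copy
            recipe.append(key)
        recipes[name] = recipe
    return store, recipes

def restore(name, store, recipes):
    """Rebuild a file by following its pointers into the chunk store."""
    return b"".join(store[k] for k in recipes[name])

files = {"a.txt": b"AAAABBBBCCCC", "b.txt": b"BBBBCCCCDDDD"}
store, recipes = deduplicate(files)
n_refs = sum(len(r) for r in recipes.values())
print(len(store), "unique chunks for", n_refs, "chunk references")
```

Because chunks are shared across files, the physical capacity a file consumes depends on which other files are stored alongside it, which is what makes capacity planning and load balancing in such systems non-trivial.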
The infrastructure for the “big data revolution” is built of systems that support storing, processing, and delivering large amounts of data efficiently. Flash-based solid-state drives (SSDs) are a key component in such systems, thanks to their ability to support parallel I/O at sub-millisecond latency and consistently high throughput. We develop theoretically-optimal algorithms for the SSD firmware which is responsible for the internal management of data and resources within the storage device.
We develop a fundamentally novel paradigm that seeks to find a simplification of a given POMDP problem, which is computationally easier, while at the same time providing performance guarantees, and ideally, similar levels of performance as the original decision making problem.
Based on this conceptually novel paradigm, we develop approaches that simplify the decision making problem, for example, by resorting to belief simplification or reward function simplification.
We develop approaches for autonomous semantic perception addressing key challenges such as classification aliasing for certain relative viewpoints between object and camera, localization uncertainty, and epistemic uncertainty of the classifier. Specifically, we develop approaches for computationally efficient probabilistic inference and decision making in the context of semantic perception and SLAM. A key component here is a learned viewpoint-dependent classifier model.
Language is a window to the person’s mind and soul. Surprisingly, while few would disagree with this statement, most behavior prediction and analysis models do not consider language usage. We develop models that do exactly this, considering both economics setups (where game theory predictions consider only the numerical incentive of the participants) as well as psychological and psychiatric challenges (e.g. predicting suicide risk in the general population based on social media postings). Our goal is to integrate linguistic signals along with other behavioral and medical signals, and provide better prediction capabilities along with improved understanding of the underlying phenomena.
A fundamental problem of machine and deep learning models in NLP is that of spurious correlations. Such heavily parametrized models often capture data-driven patterns that are correlated with their task variables, but these patterns have little connection to the actual task they are trying to perform.
This, in turn, substantially harms their generalization capacity. We hence develop methods that follow the causal inference methodology for improved model generalization, interpretation, and stability.
Domain adaptation is the problem of adapting an algorithm trained on one domain (training distribution) so that it can effectively process data from other domains (e.g. adapting a sentiment classification algorithm trained on book reviews so that it can perform well on reviews of patient experience in clinics). We consider various very challenging setups of domain adaptation, focusing on setups where very limited resources and knowledge of the target domains are available when training the algorithm.
Metastases cause ~90% of cancer mortality and prognosis is currently based on histopathology, disease-statistics, or genetics. Weihs lab developed a rapid (~2hr) early-prognostic of the clinical metastatic risk, adding predictive machine learning models, to support disease management.
Two-class and 5-class models successfully separated invasive/non-invasive or varying invasiveness-level samples with high sensitivity and specificity.
In this project, we propose a method for computing global Chebyshev nets on triangular meshes. We formulate the corresponding global parameterization problem in terms of commuting PolyVector fields, and design an efficient optimization method to solve it. We compute, for the first time, Chebyshev nets with automatically-placed singularities, and demonstrate the realizability of our approach using real material.
The first steps of embryogenesis lack transcription and rely on maternal mRNAs stored in oocytes. Thus, maternal mRNA stability is tightly regulated. A-to-I RNA editing is the most common RNA modification, which is important for normal embryonic development and regulation of innate immunity. Using dozens of high-throughput sequencing databases, we are testing whether edited mRNAs are inherited to prevent activation of the immune system against self RNA in the next generations.
A-to-I RNA editing is the most prevalent type of RNA editing in metazoans. As part of this project, we generated RESIC, an efficient pipeline that combines several approaches for the detection and classification of RNA editing sites. The pipeline can be used for all organisms and can use any number of RNA-sequencing datasets as input. Testing this tool on SARS-CoV-2 infection, our analysis implies the involvement of RNA editing in conceiving the unpredicted phenotype of COVID-19 disease.
Differential Expression Analysis (DEA) of RNA-sequencing data is frequently performed for detecting key genes, affected across different conditions. Preceding reliability-testing of the input material is crucial for consistent and strong results, yet can be challenging. In this project, we generated a tool: Biological Sequence Expression Kit (BiSEK) – a UI-based platform for DEA, dedicated to a reliable inquiry.
BiSEK is based on a novel algorithm to track discrepancies between the data and the statistical model design.
Modern computing systems are limited by the need to move data between the processing units and the memory (“memory wall”). We developed a unit that combines the data processing and storage using the same physical cells using memristive devices. This unit, called mMPU, can execute numerous logical operations simultaneously, offering an energy efficient, high performance machine that is backward compatible with standard computer architectures. The mMPU is especially efficient for applications such as genomics, databases, image processing, DNN and BNN.
Data converters (analog to digital and digital to analog) are ubiquitous in modern electronic devices and connect the real world with digital computing systems. These converters suffer from the speed-accuracy-power tradeoff. We use neuromorphic computing to build data converters that can be trained to adjust to different applications and environmental changes, and thereby achieve a better figure-of-merit than standard data converters.
We use emerging memristive technologies to design circuits and systems that accelerate deep neural networks, including their training. Our recent work has shown how to accelerate vanilla gradient descent and gradient descent with momentum using memristors. Our proposed circuits rely on using memristors to both compute and store the weights.
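For reference, the two update rules named above can be written as a short software sketch. This shows only the standard algorithms on a one-dimensional quadratic and is not a model of the memristive circuits; the objective, learning rate, and momentum coefficient are assumptions chosen for illustration.

```python
def gd(grad, w, lr, steps):
    """Vanilla gradient descent: w <- w - lr * grad(w)."""
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def gd_momentum(grad, w, lr, beta, steps):
    """Gradient descent with momentum: the velocity v accumulates
    an exponentially weighted sum of past gradients."""
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(w)
        w = w - lr * v
    return w

def grad(w):
    """Gradient of f(w) = (w - 3)^2, which is minimized at w = 3."""
    return 2.0 * (w - 3.0)

w_gd = gd(grad, 0.0, lr=0.1, steps=400)
w_mom = gd_momentum(grad, 0.0, lr=0.1, beta=0.9, steps=400)
print(w_gd, w_mom)
```

The momentum variant needs one extra state variable per weight (the velocity), so a hardware realization has to store it in addition to the weight itself.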
Models of real-world phenomena, e.g., human physiology, offer significant utility in health and disease.
However, they often suffer from misspecification. To understand the implications of such misspecification, we develop some basic theory for the simple setting of linear models, aiming to understand the benefit of the ubiquitously available unlabelled offline data in enhancing misspecified causal models. We implement these ideas on non-linear models, focusing on the cardiovascular system, where an abundance of unlabelled data and (partial) physiological models are available.
The effectiveness of learning systems depends on both the attributes of the learner and the teacher. Indeed, an optimal setup for learning is when the student and teacher/environment operate collaboratively to enhance learning, where the teacher’s task is to develop an appropriate learning curriculum that facilitates learning by the student. We develop approaches to enhance agents’ learning within a curriculum setting, focusing on the model-based Reinforcement Learning agents and continuous control settings.
Effective learning from data requires prior assumptions, referred to as inductive bias. A fundamental question pertains to the source of a ‘good’ inductive bias. One natural way to form such a bias is through lifelong learning, where an agent continually interacts with the world through a sequence of tasks, aiming to improve its performance on future tasks based on the tasks it has encountered so far. We develop a theoretical framework for incremental inductive bias formation, and demonstrate its effectiveness in problems of sequential learning and decision making.
We are developing a smartphone app for cardiologists to help analyze ECG charts. Our methods identify dozens of cardio-related conditions: “Automatic classification of healthy and disease conditions from images or digital standard 12-lead ECGs.” Vadim Gliner, Noam Keidar, Vladimir Makarov, Arutyun I. Avetisyan, Assaf Schuster and Yael Yaniv. Scientific Reports. September 2020. We develop tools to assist physicians in using AI tools: “Meeting the unmet needs of clinicians from AI systems in cardiology: A systematic formulation, and a suggested framework.” Yonatan Elul, Aviv Rosenberg, Assaf Schuster, Alex Bronstein, Yael Yaniv. Proceedings of the National Academy of Sciences of the United States of America (PNAS). April 2021. We are working on predicting cardiovascular events.
We developed asynchronous versions of data-parallel training and showed them to be faster than their synchronous counterparts: “Taming Momentum in a Distributed Asynchronous Environment.” Ido Hakimi, Saar Barkai, Moshe Gabel, Assaf Schuster. Aug 2019, arXiv. We also addressed the issue associated with asynchrony, named “staleness”: “Gap-Aware Mitigation of Gradient Staleness.” Saar Barkai, Ido Hakimi, Assaf Schuster. ICLR 2020. We developed a model-parallel approach for fine-tuning giant deep models on commodity hardware (submitted for publication).