Research

Projects

Use user behavior to improve automatic database schema matching

Database schema matching is a challenging task that has called for improvement for several decades. Automatic algorithms fail to provide sufficiently reliable results. We use human matching to overcome algorithmic failures and vice versa. We refer to human and algorithmic matchers as imperfect matchers with different strengths and weaknesses. We use insights from cognitive research to predict the behavior of human matchers and to identify those who are likely to perform better than others. We then merge their responses with algorithmic outcomes to obtain better results.

Massive Parallelization of Deep Learning

Improvements in training speed are needed to develop the next generation of deep learning models. To perform such a massive amount of computation in a reasonable time, the computation is parallelized across multiple GPU cores. Perhaps the most popular parallelization method is to use a large batch of data in each iteration of stochastic gradient descent (SGD), so that the gradient computation can be performed in parallel on multiple workers. We aim to enable massive parallelization without the performance degradation that is commonly observed.
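
The large-batch idea can be illustrated with a minimal sketch. It assumes a toy one-parameter linear model and equal-sized shards, and the function names (grad_mse, data_parallel_grad) are hypothetical. It only shows why the gradient of a large batch can be computed on several workers and then averaged; it does not address the degradation issue that the project studies.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the 1-D linear model y ~ w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def data_parallel_grad(w, xs, ys, num_workers):
    """Split the batch into equal shards, compute one gradient per worker, then average."""
    if len(xs) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    shard = len(xs) // num_workers
    grads = [
        grad_mse(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(num_workers)  # each iteration stands in for one worker
    ]
    return sum(grads) / num_workers


# Toy batch of 8 samples generated by y = 2x (assumed data for illustration).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]
w = 0.5

full = grad_mse(w, xs, ys)                               # single-worker gradient
parallel = data_parallel_grad(w, xs, ys, num_workers=4)  # averaged shard gradients
```

With equal-sized shards, the averaged per-worker gradients match the full-batch gradient, which is why the batch can be spread over workers without changing the update.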

Resource efficient deep learning

We aim to improve the resource efficiency of deep learning (e.g., energy, bandwidth) for training and inference. Our focus is on decreasing the numerical precision of neural network models, which is a simple and effective way to improve their resource efficiency. Nearly all recent deep learning hardware relies heavily on lower-precision math. The benefits are a reduction in the memory required to store the neural network, a reduction in chip area, and a drastic improvement in energy efficiency.
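
As a rough illustration of reduced numerical precision, the sketch below applies symmetric uniform 8-bit quantization to a handful of weights. The scheme, the function names, and the example weights are assumptions chosen for illustration, not necessarily the method used in this project.

```python
def quantize_int8(values):
    """Symmetric uniform quantization to integer codes in [-127, 127]; returns (codes, scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


# Hypothetical weights of a small layer.
weights = [0.82, -1.27, 0.05, 0.4, -0.63]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each code fits in one byte instead of the four bytes of a 32-bit float, which is the source of the memory saving mentioned above.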

Understanding and controlling the implicit bias in deep learning

Significant research efforts are being invested in improving Deep Neural Networks (DNNs) via various modifications. However, such modifications often cause an unexplained degradation in the generalization performance of DNNs on unseen data. Recent findings suggest that this degradation is caused by changes to the hidden algorithmic bias of the training algorithm and model. This bias determines which solution is selected from among all the solutions that fit the data. We aim to understand and control this algorithmic bias.

Function-Correcting Codes

Motivated by applications in machine learning and archival storage, we introduce function-correcting codes (FCCs), a new class of codes to protect a function evaluation of the data against errors. We show that FCCs are equivalent to irregular-distance codes, i.e., codes that obey some given distance requirement between each pair of codewords. Using these connections, we study these codes and derive general upper and lower bounds on their optimal redundancy. Since these bounds depend on the specific function, we provide simplified, suboptimal bounds that are easier to evaluate.

Weakly Private Information Retrieval

Private information retrieval (PIR) protocols make it possible to retrieve a file from a database without disclosing any information about the identity of the file being retrieved. While existing protocols strictly require that no information about the file's identity is leaked, this project initiates the study of the tradeoffs that can be achieved by relaxing the requirement of perfect privacy. We propose to study this problem both when the database is replicated or stored in a distributed manner over several servers and when it is stored on a single server.
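
For context, the perfect-privacy baseline that this project relaxes can be illustrated with the classic two-server XOR scheme for a replicated database of bits. This is a textbook sketch, not the weakly private protocol studied here. It assumes two servers that do not collude, and the function names are hypothetical.

```python
import secrets


def server_answer(database, query):
    """A server XORs together the database bits selected by the query mask."""
    answer = 0
    for bit, selected in zip(database, query):
        if selected:
            answer ^= bit
    return answer


def make_queries(n, index):
    """Build two masks that differ only at `index`; each mask alone is uniformly random."""
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = list(q1)
    q2[index] ^= 1  # flip the bit at the wanted position
    return q1, q2


def retrieve(database, index):
    """XOR of the two answers cancels every bit except the one at `index`."""
    q1, q2 = make_queries(len(database), index)
    return server_answer(database, q1) ^ server_answer(database, q2)


# Hypothetical database of 8 one-bit files, replicated on both servers.
database = [1, 0, 1, 1, 0, 0, 1, 0]
recovered = [retrieve(database, i) for i in range(len(database))]
```

Each server sees a uniformly random mask, so neither learns the index on its own. The price is a query as long as the database, which is the kind of cost that a relaxed privacy requirement could trade against.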

Distributed Storage and Computation through Coded Sharding

When decentralized applications (for example, blockchains) use a distributed storage system and access individual shards of large data units, they need features that existing distributed storage systems do not offer. In particular, coding the data with standard erasure codes does not allow adequate access performance. We develop erasure codes that specifically address efficient recovery and access in decentralized applications.

Reliability of Machine Learning in Distributed Systems

The common use of AI today is that data is provided to some central computing facility (in the cloud), where the learning tasks (training and inference) are performed. The main issues with this practice are high communication cost and compromised data privacy. Moving part of the learning tasks to the edge mitigates these issues. The key question is how to aggregate multiple unreliable outputs from the edge into one reliable learning output, where unreliability manifests as missing inputs (stragglers), wrong inputs, and malicious inputs.
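
One simple way to picture the aggregation question is to compare a plain average with a coordinate-wise median over the edge reports that actually arrive. The median is a generic robust aggregator used here only as an assumed example, not as this project's method. The report values and function names are hypothetical.

```python
from statistics import median


def received_reports(reports):
    """Drop stragglers, which are marked here by None."""
    received = [r for r in reports if r is not None]
    if not received:
        raise ValueError("no reports received")
    return received


def aggregate_mean(reports):
    """Coordinate-wise mean: a single bad report can shift it arbitrarily."""
    received = received_reports(reports)
    dim = len(received[0])
    return [sum(r[k] for r in received) / len(received) for k in range(dim)]


def aggregate_median(reports):
    """Coordinate-wise median: tolerates a minority of wrong or malicious reports."""
    received = received_reports(reports)
    dim = len(received[0])
    return [median(r[k] for r in received) for k in range(dim)]


# Three honest edge outputs, one straggler (None), and one malicious output.
reports = [[1.0, 2.0], [1.1, 1.9], None, [0.9, 2.1], [100.0, -100.0]]

mean_result = aggregate_mean(reports)
median_result = aggregate_median(reports)
```

In this toy case the median stays close to the honest values near (1, 2), while the mean is pulled far away by the single malicious report.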

Certified Robustness of Modern Machine Learning
We develop methodologies that provide provably robust predictions in challenging settings where the training and test distributions differ, e.g., due to adversarial attacks.
Online POMDP and BSP Planning via Simplification

We develop a fundamentally novel paradigm that seeks a simplification of a given POMDP problem that is computationally easier to solve while still providing performance guarantees and, ideally, a level of performance similar to that of the original decision-making problem.
Based on this paradigm, we develop approaches that simplify the decision-making problem, for example, by resorting to belief simplification or reward function simplification.

Autonomous Semantic Perception under Uncertainty

We develop approaches for autonomous semantic perception that address key challenges such as classification aliasing for certain relative viewpoints between object and camera, localization uncertainty, and epistemic uncertainty of the classifier. Specifically, we develop approaches for computationally efficient probabilistic inference and decision making in the context of semantic perception and SLAM. A key component here is a learned viewpoint-dependent classifier model.

Online and bandit optimization
In this project we study how to make decisions in an unknown environment in an online setting.
People:
Nir Ailon
Large matrix approximation for acceleration of deep networks
In this work we apply matrix approximation theory to reduce the cost of training and deploying dense layers in deep networks.
People:
Nir Ailon
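
To give a feel for how matrix approximation can cut the cost of a dense layer, the sketch below replaces an m x n weight matrix W by two low-rank factors A (m x r) and B (r x n). It assumes W is exactly of rank r so that the two computations agree; in practice a factorization would only approximate W. The factors, sizes, and function names are hypothetical and are not taken from the project.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in left]


m, n, r = 8, 10, 2  # assumed layer size and rank

# Hypothetical low-rank factors with small integer entries (exact in floating point).
A = [[float((i + 2 * k) % 3 - 1) for k in range(r)] for i in range(m)]  # m x r
B = [[float((j + k) % 4 - 1) for j in range(n)] for k in range(r)]      # r x n
W = matmul(A, B)                                                        # m x n, rank <= r

x = [float(j % 3) for j in range(n)]

dense_out = matvec(W, x)                # costs m * n multiplications
factored_out = matvec(A, matvec(B, x))  # costs r * n + m * r multiplications

dense_mults = m * n
factored_mults = r * n + m * r
```

For these sizes the factored layer needs 36 multiplications instead of 80, and the saving grows as the rank r becomes small relative to m and n.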