Knowing a company’s true sales can help determine its value. Investors, for instance, often employ financial analysts to predict a company’s upcoming earnings using various public data, computational tools, and their own intuition. Now MIT researchers have developed an automated model that significantly outperforms humans in predicting business sales using very limited, “noisy” data.
In finance, there’s growing interest in using imprecise but frequently generated consumer data — called “alternative data” — to help predict a company’s earnings for trading and investment purposes. Alternative data can comprise credit card purchases, location data from smartphones, or even satellite images showing how many cars are parked in a retailer’s lot. Combining alternative data with more traditional but infrequent ground-truth financial data — such as quarterly earnings, press releases, and stock prices — can paint a clearer picture of a company’s financial health on even a daily or weekly basis.
But, so far, it’s been very difficult to get accurate, frequent estimates using alternative data. In a paper published this week in the Proceedings of the ACM Sigmetrics Conference, the researchers describe a model for forecasting financials that uses only anonymized weekly credit card transactions and three-month earnings reports.
Tasked with predicting quarterly earnings of more than 30 companies, the model outperformed the combined estimates of expert Wall Street analysts on 57 percent of predictions. Notably, the analysts had access to any available private or public data and other machine-learning models, while the researchers’ model used a very small dataset of the two data types.
“Alternative data are these weird, proxy signals to help track the underlying financials of a company,” says first author Michael Fleder, a postdoc in the Laboratory for Information and Decision Systems (LIDS). “We asked, ‘Can you combine these noisy signals with quarterly numbers to estimate the true financials of a company at high frequencies?’ Turns out the answer is yes.”
The model could give an edge to investors, traders, or companies looking to frequently compare their sales with competitors. Beyond finance, the model could help social and political scientists, for example, to study aggregated, anonymous data on public behavior. “It’ll be useful for anyone who wants to figure out what people are doing,” Fleder says.
Joining Fleder on the paper is EECS Professor Devavrat Shah, who is the director of MIT’s Statistics and Data Science Center, a member of the Laboratory for Information and Decision Systems, a principal investigator for the MIT Institute for Foundations of Data Science, and an adjunct professor at the Tata Institute of Fundamental Research.
Tackling the “small data” problem
For better or worse, a lot of consumer data is up for sale. Retailers, for instance, can buy credit card transactions or location data to see how many people are shopping at a competitor. Advertisers can use the data to see how their advertisements are impacting sales. But getting those answers still primarily relies on humans. No machine-learning model has been able to adequately crunch the numbers.
Counterintuitively, the problem is actually lack of data. Each financial input, such as a quarterly report or weekly credit card total, is only one number. Quarterly reports over two years total only eight data points. Credit card data for, say, every week over the same period is only roughly another 100 “noisy” data points, meaning they contain potentially uninterpretable information.
“We have a ‘small data’ problem,” Fleder says. “You only get a tiny slice of what people are spending and you have to extrapolate and infer what’s really going on from that fraction of data.”
For their work, the researchers obtained from a hedge fund consumer credit card transactions (typically at weekly or biweekly intervals) and quarterly reports for 34 retailers from 2015 to 2018. Across all companies, they gathered 306 quarters’ worth of data in total.
Computing daily sales is fairly simple in concept. The model assumes a company’s daily sales remain similar, only slightly decreasing or increasing from one day to the next. Mathematically, that means each day’s sales equal the previous day’s sales multiplied by some constant value, plus a statistical noise value that captures some of the inherent randomness in a company’s sales. Tomorrow’s sales, for instance, equal today’s sales multiplied by, say, 0.998 or 1.01, plus the estimated number for noise.
If given accurate model parameters for the daily constant and noise level, a standard inference algorithm can calculate that equation to output an accurate forecast of daily sales. But the trick is calculating those parameters.
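The day-to-day relationship described above can be written down in a few lines. The following is a minimal sketch rather than the researchers’ code: the function name, the starting sales figure, and the noise level are all hypothetical, and only the multipliers 0.998 and 1.01 come from the article.

```python
import random

def simulate_daily_sales(first_day, growth, noise_std, days, seed=0):
    """Simulate sales[t+1] = growth * sales[t] + noise.

    Every argument is a hypothetical illustration value, not a fitted parameter.
    """
    rng = random.Random(seed)
    sales = [first_day]
    for _ in range(days - 1):
        noise = rng.gauss(0.0, noise_std)  # day-to-day randomness in sales
        sales.append(growth * sales[-1] + noise)
    return sales

# Roughly one quarter of daily sales with a slight downward drift.
daily = simulate_daily_sales(first_day=1_000_000.0, growth=0.998,
                             noise_std=5_000.0, days=90)
quarter_total = sum(daily)
print(f"first day: {daily[0]:,.0f}   last day: {daily[-1]:,.0f}")
print(f"quarterly total: {quarter_total:,.0f}")
```

Summing the simulated days gives the kind of quarterly total a company would report, which is the only ground truth the model ever sees.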
Untangling the numbers
That’s where quarterly reports and probability techniques come in handy. In a simple world, a quarterly report could be divided by, say, 90 days to calculate the daily sales (implying sales are roughly constant day-to-day). In reality, sales vary from day to day. Also, including alternative data to help understand how sales vary over a quarter complicates matters: Apart from being noisy, purchased credit card data always consist of some indeterminate fraction of the total sales. All that makes it very difficult to know how exactly the credit card totals factor into the overall sales estimate.
“That requires a bit of untangling the numbers,” Fleder says. “If we observe 1 percent of a company’s weekly sales through credit card transactions, how do we know it’s 1 percent? And, if the credit card data is noisy, how do you know how noisy it is? We don’t have access to the ground truth for daily or weekly sales totals. But the quarterly aggregates help us reason about those totals.”
To do so, the researchers use a variation of the standard inference algorithm, called Kalman filtering or Belief Propagation, which has been used in various technologies from space shuttles to smartphone GPS. Kalman filtering uses data measurements observed over time, containing noise inaccuracies, to generate a probability distribution for unknown variables over a designated timeframe. In the researchers’ work, that means estimating the possible sales of a single day.
To train the model, the technique first breaks down quarterly sales into a set number of measured days, say 90 — allowing sales to vary day-to-day. Then, it matches the observed, noisy credit card data to unknown daily sales. Using the quarterly numbers and some extrapolation, it estimates the fraction of total sales the credit card data likely represents. Then, it calculates each day’s fraction of observed sales, noise level, and an error estimate for how well it made its predictions.
The inference algorithm plugs all those values into the formula to predict daily sales totals. Then, it can sum those totals to get weekly, monthly, or quarterly numbers. Across all 34 companies, the model beat a consensus benchmark — which combines estimates of Wall Street analysts — on 57.2 percent of 306 quarterly predictions.
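For readers who want to see the mechanics, here is a minimal sketch of a one-dimensional Kalman filter applied to this kind of problem. It makes several simplifying assumptions that the article does not: the daily multiplier, the observed fraction of sales, and both noise levels are treated as already known (the hard part of the researchers’ work is learning them), credit card totals arrive daily rather than weekly, and every number is invented for illustration.

```python
import random

def kalman_filter_1d(observations, a, c, q, r, x0, p0):
    """Scalar Kalman filter for x[t+1] = a*x[t] + w and y[t] = c*x[t] + v.

    x is the hidden daily sales, y the noisy card total, c the fraction of
    sales seen through cards, q and r the process and observation variances.
    All parameters are assumed known here, unlike in the article's setting.
    Returns a list of (mean, variance) estimates, one per observation.
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Update: blend the current prediction with the noisy observation.
        gain = p * c / (c * c * p + r)
        x = x + gain * (y - c * x)
        p = (1.0 - gain * c) * p
        estimates.append((x, p))
        # Predict: carry the estimate forward one day.
        x = a * x
        p = a * a * p + q
    return estimates

# Hypothetical quarter: cards capture 1 percent of sales, with noise.
rng = random.Random(1)
a, c, q, r = 0.998, 0.01, 5_000.0 ** 2, 500.0 ** 2
true_sales, card_totals = [], []
x = 1_000_000.0
for _ in range(90):
    true_sales.append(x)
    card_totals.append(c * x + rng.gauss(0.0, r ** 0.5))
    x = a * x + rng.gauss(0.0, q ** 0.5)

estimates = kalman_filter_1d(card_totals, a, c, q, r,
                             x0=900_000.0, p0=200_000.0 ** 2)
estimated_quarter = sum(mean for mean, _ in estimates)
print(f"true quarterly sales:      {sum(true_sales):,.0f}")
print(f"estimated quarterly sales: {estimated_quarter:,.0f}")
```

Each estimate is a mean and a variance, which mirrors the article’s description of a probability distribution over a single day’s sales. Summing the daily means gives the weekly, monthly, or quarterly figure.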
Next, the researchers are designing the model to analyze a combination of credit card transactions and other alternative data, such as location information. “This isn’t all we can do. This is just a natural starting point,” Fleder says.
Ten years after the publication of their first plan for powering the world with wind, water, and solar, researchers offer an updated vision of the steps that 143 countries around the world can take to attain 100% clean, renewable energy by the year 2050. The new roadmaps, publishing December 20 in the journal One Earth, follow up on previous work that formed the basis for the energy portion of the U.S. Green New Deal and other state, city, and business commitments to 100% clean, renewable energy around the globe — and use the latest energy data available in each country to offer more precise guidance on how to reach those commitments.
In this update, Mark Z. Jacobson of Stanford University and his team find low-cost, stable grid solutions in 24 world regions encompassing the 143 countries. They project that transitioning to clean, renewable energy could reduce worldwide energy needs by 57%, create 28.6 million more jobs than are lost, and reduce energy, health, and climate costs by 91% compared with a business-as-usual analysis. The new paper makes use of updated data about how each country’s energy use is changing, acknowledges lower costs and greater availability of renewable energy and storage technology, includes new countries in its analysis, and accounts for recently built clean, renewable infrastructure in some countries.
“There are a lot of countries that have committed to doing something to counteract the growing impacts of global warming, but they still don’t know exactly what to do,” says Jacobson, a professor of civil and environmental engineering at Stanford and the co-founder of the Solutions Project, a U.S. non-profit educating the public and policymakers about a transition to 100% clean, renewable energy. “How it would work? How it would keep the lights on? To be honest, many of the policymakers and advocates supporting and promoting the Green New Deal don’t have a good idea of the details of what the actual system looks like or what the impact of a transition is. It’s more an abstract concept. So, we’re trying to quantify it and to pin down what one possible system might look like. This work can help fill that void and give countries guidance.”
The roadmaps call for the electrification of all energy sectors, for increased energy efficiency leading to reduced energy use, and for the development of wind, water, and solar infrastructure that can supply 80% of all power by 2030 and 100% of all power by 2050. The energy sectors covered are electricity; transportation; building heating and cooling; industry; agriculture, forestry, and fishing; and the military. The researchers’ modeling suggests that the efficiency of electric and hydrogen fuel cell vehicles over fossil fuel vehicles, of electrified industry over fossil industry, and of electric heat pumps over fossil heating and cooling, along with the elimination of energy needed for mining, transporting, and refining fossil fuels, could substantially decrease overall energy use.
The transition to wind, water, and solar would require an initial investment of $73 trillion worldwide, but this would pay for itself over time through energy sales. Clean, renewable energy is also cheaper to generate over time than fossil fuels, so the investment reduces annual energy costs significantly. It further reduces air pollution and its health impacts, and requires only 0.17% of the 143 countries’ total land area for new infrastructure and 0.48% of their total land area for spacing purposes, such as between wind turbines.
“We find that by electrifying everything with clean, renewable energy, we reduce power demand by about 57%,” Jacobson says. “So even if the cost per unit of energy is similar, the cost that people pay in the aggregate for energy is 61% less. And that’s before we account for the social cost, which includes the costs we will save by mitigating health and climate damage. That’s why the Green New Deal is such a good deal. You’re reducing energy costs by 60% and social costs by 91%.”
In the U.S., this roadmap, which corresponds to the energy portion of the Green New Deal and will eliminate the use of all fossil fuels for energy in the country, requires an upfront investment of $7.8 trillion. It calls for the construction of 288,000 new large (5 megawatt) wind turbines and 16,000 large (100 megawatt) solar farms on just 1.08% of U.S. land, with over 85% of that land used for spacing between wind turbines. The spacing land can double, for instance, as farmland. The plan creates 3.1 million more U.S. jobs than the business-as-usual case, and saves 63,000 lives from air pollution per year. It reduces energy, health, and climate costs by $1.3 trillion, $0.7 trillion, and $3.1 trillion per year, respectively, compared with the current fossil fuel energy infrastructure.
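The U.S. figures above can be combined with simple arithmetic. The short sketch below uses only numbers quoted in the article. The variable names are ours, and the results are nameplate (maximum rated) capacities, not the energy the plants would actually deliver.

```python
# Figures quoted in the article for the U.S. roadmap.
wind_turbines = 288_000
turbine_mw = 5
solar_farms = 16_000
solar_farm_mw = 100

# Nameplate capacity in gigawatts (1 GW = 1,000 MW).
wind_gw = wind_turbines * turbine_mw / 1_000
solar_gw = solar_farms * solar_farm_mw / 1_000

# Annual cost reductions in trillions of dollars: energy, health, climate.
annual_savings_trillion = 1.3 + 0.7 + 3.1

# At most 15% of the 1.08% land share is not spacing ("over 85%" is spacing).
max_footprint_pct = 1.08 * (1 - 0.85)

print(f"new wind nameplate capacity:  {wind_gw:,.0f} GW")
print(f"new solar nameplate capacity: {solar_gw:,.0f} GW")
print(f"combined annual cost cut:     ${annual_savings_trillion:.1f} trillion")
print(f"non-spacing land share:       under {max_footprint_pct:.2f}% of U.S. land")
```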
And the transition is already underway. “We have 11 states, in addition to the District of Columbia, Puerto Rico, and a number of major U.S. cities that have committed to 100% or effectively 100% renewable electric,” Jacobson says. “That means that every time they need new electricity because a coal plant or gas plant retires, they will only select among renewable sources to replace them.”
He believes that individuals, businesses, and lawmakers all have an important role to play in achieving this transition. “If I just wrote this paper and published it and it didn’t have a support network of people who wanted to use this information,” he says, “it would just get lost in the dusty literature. If you want a law passed, you really need the public to be supportive.”
Like any model, this one comes with uncertainties. There are inconsistencies between datasets on energy supply and demand, and the findings depend on the ability to model future energy consumption. The model also assumes the perfect transmission of energy from where it’s plentiful to where it’s needed, with no bottlenecking and no loss of energy along power lines. While this is never the case, many of the assessments were done on countries with small enough grids that the difference is negligible, and Jacobson argues that larger countries like the U.S. can be broken down into smaller grids to make perfect transmission less of a concern. The researchers addressed additional uncertainties by modeling scenarios with high, mean, and low costs of energy, air pollution damage, and climate damage.
The work deliberately focuses only on wind, water, and solar power and excludes nuclear power, “clean coal,” and biofuels. Nuclear power is excluded because it requires 10-19 years between planning and operation and has high costs and acknowledged meltdown, weapons proliferation, mining, and waste risks. “Clean coal” and biofuels are not included because they both cause heavy air pollution and still emit over 50 times more carbon per unit of energy than wind, water, or solar power.
One concern often discussed with wind and solar power is that they may not be able to reliably match energy supplies to the demands of the grid, as they are dependent on weather conditions and time of year. This issue is addressed squarely in the present study in 24 world regions. The study finds that demand can be met by intermittent supply and storage throughout the world. Jacobson and his team found that electrifying all energy sectors actually creates more flexible demand for energy. Flexible demand is demand that does not need to be met immediately. For example, an electric car battery can be charged any time of day or night or an electric heat pump water heater can heat water any time of day or night. Because electrification of all energy sectors creates more flexible demand, matching demand with supply and storage becomes easier in a clean, renewable energy world.
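The idea of flexible demand can be illustrated with a toy scheduler. This is not the study’s grid model: the function name, the greedy rule, and the four-hour example numbers are all hypothetical, chosen only to show how deferrable load can be moved into the hours when supply exceeds fixed demand.

```python
def schedule_flexible_load(supply, fixed_demand, flexible_energy):
    """Greedily place deferrable energy into the hours with the most spare supply.

    Returns (schedule, unmet), where schedule[h] is the flexible energy served
    in hour h and unmet is whatever could not be placed.
    """
    surplus = [max(s - d, 0.0) for s, d in zip(supply, fixed_demand)]
    schedule = [0.0] * len(supply)
    remaining = flexible_energy
    # Visit hours from the largest surplus to the smallest.
    for hour in sorted(range(len(supply)), key=lambda h: surplus[h], reverse=True):
        if remaining <= 0:
            break
        used = min(surplus[hour], remaining)
        schedule[hour] = used
        remaining -= used
    return schedule, remaining

# Hypothetical four-hour window, arbitrary energy units.
supply = [2.0, 6.0, 9.0, 4.0]        # wind and solar output per hour
fixed_demand = [3.0, 3.0, 3.0, 3.0]  # demand that must be met immediately
schedule, unmet = schedule_flexible_load(supply, fixed_demand, flexible_energy=7.0)
print(schedule, unmet)
```

In this example the car charging or water heating lands in the two sunniest, windiest hours, and nothing is left unserved. A real system would also draw on storage for the first hour, where supply falls short of fixed demand.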
Jacobson also notes that the roadmaps this study offers are not the only possible ones and points to work done by 11 other groups that also found feasible paths to 100% clean, renewable energy. “We’re just trying to lay out one scenario for 143 countries to give people in these and other countries the confidence that yes, this is possible. But there are many solutions and many scenarios that could work. You’re probably not going to predict exactly what’s going to happen, but it’s not like you need to find the needle in the haystack. There are lots of needles in this haystack.”
Bilingual children use as many words as monolingual children when telling a story, and demonstrate high levels of cognitive flexibility, according to new research by University of Alberta scientists.
“We found that the number of words that bilingual children use in their stories is highly correlated with their cognitive flexibility — the ability to switch between thinking about different concepts,” said Elena Nicoladis, lead author and professor in the Department of Psychology in the Faculty of Science. “This suggests that bilinguals are adept at using the medium of storytelling.”
Vocabulary is a strong predictor of school achievement, and so is storytelling. “These results suggest that parents of bilingual children do not need to be concerned about long-term school achievement,” said Nicoladis. “In a storytelling context, bilingual kids are able to use this flexibility to convey stories in creative ways.”
The research examined a group of French-English bilingual children who have been taught two languages since birth, rather than learning a second language later in life. Results show that bilingual children used just as many words to tell a story in English as monolingual children. Participants also used just as many words in French as they did in English when telling a story.
Previous research has shown that bilingual children score lower than monolingual children on traditional vocabulary tests, meaning these results are changing our understanding of multiple languages and cognition in children.
“The past research is not surprising,” added Nicoladis. “Learning a word is related to how much time you spend in each language. For bilingual children, time is split between languages. So, unsurprisingly, they tend to have lower vocabularies in each of their languages. However, this research shows that as a function of storytelling, bilingual children are equally strong as monolingual children.”
This research used a new, highly sensitive measure of cognitive flexibility that tests a participant’s ability to switch between games with different rules while maintaining accuracy and reaction time. This study builds on previous research examining vocabulary in bilingual children who have learned English as a second language.
While sifting through fossil soils in the Catskill region near Cairo, New York, researchers uncovered the extensive root system of 386-million-year-old primitive trees. The fossils, located about 25 miles from the site previously believed to have the world’s oldest forests, are evidence that the transition toward forests as we know them today began earlier in the Devonian Period than typically believed.
“The Devonian Period represents a time in which the first forest appeared on planet Earth,” says first author William Stein, an emeritus professor of biological science at Binghamton University, New York. “The effects were of first order magnitude, in terms of changes in ecosystems, what happens on the Earth’s surface and oceans, in global atmosphere, CO2 concentration in the atmosphere, and global climate. So many dramatic changes occurred at that time as a result of those original forests that basically, the world has never been the same since.”
Stein and his collaborators, including Christopher Berry and Jennifer Morris of Cardiff University and Jonathan Leake of the University of Sheffield, have been working in the Catskill region in New York, where in 2012 they uncovered “footprint evidence” of a different fossil forest at Gilboa, which for many years has been termed the Earth’s oldest forest. The discovery at Cairo, about a 40-minute drive from the original site, now reveals an even older forest with dramatically different composition.
The Cairo site presents three unique root systems, leading Stein and his team to hypothesize that much like today, the forests of the Devonian Period were composed of different trees occupying different places depending on local conditions.
First, Stein and his team identified a rooting system that they believe belonged to a palm tree-like plant called Eospermatopteris. This tree, which was first identified at the Gilboa site, had relatively rudimentary roots. Like a weed, Eospermatopteris likely occupied many environments, explaining its presence at both sites. But its roots had relatively limited range and probably lived only a year or two before dying and being replaced by other roots that would occupy the same space. The researchers also found evidence of a tree called Archaeopteris, which shares a number of characteristics with modern seed plants.
“Archaeopteris seems to reveal the beginning of the future of what forests will ultimately become,” says Stein. “Based on what we know from the body fossil evidence of Archaeopteris prior to this, and now from the rooting evidence that we’ve added at Cairo, these plants are very modern compared to other Devonian plants. Although still dramatically different than modern trees, yet Archaeopteris nevertheless seems to point the way toward the future of forests elements.”
Stein and his team were also surprised to find a third root system in the fossilized soil at Cairo belonging to a tree thought to only exist during the Carboniferous Period and beyond: “scale trees” belonging to the class Lycopsida.
“What we have at Cairo is a rooting structure that appears identical to great trees of the Carboniferous coal swamps with fascinating elongate roots. But no one has yet found body fossil evidence of this group this early in the Devonian,” Stein says. “Our findings are perhaps suggestive that these plants were already in the forest, but perhaps in a different environment, earlier than generally believed. Yet we only have a footprint, and we await additional fossil evidence for confirmation.”
Moving forward, Stein and his team hope to continue investigating the Catskill region and compare their findings with fossil forests around the world.
“It seems to me, worldwide, many of these kinds of environments are preserved in fossil soils. And I’d like to know what happened historically, not just in the Catskills, but everywhere,” says Stein. “Understanding evolutionary and ecological history, that’s what I find most satisfying.”
As the number and technology of humans have grown, their impact on the natural world has come to equal or exceed that of natural processes, according to scientists.
Many researchers formally name this period of human dominance of natural systems the Anthropocene era, but there is heated debate over whether this naming should take place and when the period began.
In a co-authored paper published online in the journal Anthropocene, University of Illinois at Chicago paleontologist Roy Plotnick argues that the fossil record of mammals will provide a clear signal of the Anthropocene.
He and Karen Koy of Missouri Western State University report that the number of humans and their animals greatly exceeds that of wild animals.
As an example, in the state of Michigan alone, humans and their animals compose about 96% of the total mass of animals. There are as many chickens as people in the state, and the same should be true in many places in the United States and the world, they say.
“The chance of a wild animal becoming part of the fossil record has become very small,” said Plotnick, UIC professor of earth and environmental sciences and the paper’s lead author. “Instead, the future mammal record will be mostly cows, pigs, sheep, goats, dogs, cats, etc., and people themselves.”
While humans bury most of their dead in cemeteries and have for centuries, their activities have markedly changed how and where animals are buried.
These impacts include alterations in the distribution and properties of natural sites of preservation, associated with shifts in land use and climate change; the production of novel sites for preservation, such as landfills and cemeteries; and changes in the breakdown of animal and human carcasses.
Additionally, the use of large agricultural equipment and increased domestic animal density due to intensive animal farming likely increases the rate of and changes the kind of damage to bones, according to the paleontologists.
“Fossil mammals occur in caves, ancient lakebeds and river channels, and are usually only teeth and isolated bones,” he said. “Animals that die on farms or in mass deaths due to disease often end up as complete corpses in trenches or landfills, far from water.”
Consequently, the fossils from the world today will be unique in the Earth’s history and unmistakable to paleontologists 100,000 years from now, according to the researchers.
“In the far future, the fossil record of today will have a huge number of complete hominid skeletons, all lined up in rows,” Plotnick said.
Princeton researchers have uncovered new rules governing how objects absorb and emit light, fine-tuning scientists’ control over light and boosting research into next-generation solar and optical devices.
The discovery solves a longstanding problem of scale, where light’s behavior when interacting with tiny objects violates well-established physical constraints observed at larger scales.
“The kinds of effects you get for very small objects are different from the effects you get from very large objects,” said Sean Molesky, a postdoctoral researcher in electrical engineering and the study’s first author. The difference can be observed in moving from a molecule to a grain of sand. “You can’t simultaneously describe both things,” he said.
The problem stems from light’s famous shapeshifting nature. For ordinary objects, light’s movement can be described by straight lines, or rays. But for microscopic objects, light’s wave properties take over and the neat rules of ray optics break down. The effects are significant. In important modern materials, observations at the micron scale showed infrared light radiating at millions of times more energy per unit area than ray optics predicts.
The new rules, published in Physical Review Letters on Dec. 20, tell scientists how much infrared light an object of any scale can be expected to absorb or emit, resolving a decades-old discrepancy between big and small. The work extends a 19th-century concept, known as a blackbody, into a useful modern context. Blackbodies are idealized objects that absorb and emit light with maximum efficiency.
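For context, the classical blackbody limit that the new work extends is the textbook Stefan-Boltzmann law, which gives the power an ideal blackbody radiates per unit area at a given temperature. The sketch below is not from the paper. It shows only the large-object limit, and the function names and example temperatures are our own choices.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant in W / (m^2 K^4)

def blackbody_exitance(temperature_k):
    """Power per unit area radiated by an ideal blackbody at temperature_k."""
    if temperature_k < 0:
        raise ValueError("temperature must be non-negative, in kelvin")
    return SIGMA * temperature_k ** 4

def radiated_power(temperature_k, area_m2, emissivity=1.0):
    """Power from a large surface; emissivity in [0, 1] scales the ideal limit.

    The cap at 1 is the large-object assumption; per the article, objects at
    the micron scale are not bound by it.
    """
    if not 0.0 <= emissivity <= 1.0:
        raise ValueError("emissivity must lie between 0 and 1")
    return emissivity * area_m2 * blackbody_exitance(temperature_k)

# Example temperatures are illustrative choices, not values from the study.
for label, kelvin in [("room-temperature object", 300.0),
                      ("incandescent filament", 2700.0)]:
    print(f"{label}: {blackbody_exitance(kelvin):,.0f} W/m^2")
```

Because the output scales with the fourth power of temperature, a ninefold rise in temperature raises the radiated power per unit area by a factor of 6,561.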
“There’s been a lot of research done to try to understand in practice, for a given material, how one can approach these blackbody limits,” said Alejandro Rodriguez, an associate professor of electrical engineering and the study’s principal investigator. “How can we make a perfect absorber? A perfect emitter?”
“It’s a very old problem that many physicists — including Planck, Einstein and Boltzmann — tackled early on and laid the foundations for the development of quantum mechanics.”
A large body of previous work has shown that structuring objects with nanoscale features can enhance absorption and emission, effectively trapping photons in a tiny hall of mirrors. But no one had defined the fundamental limits of the possible, leaving open major questions about how to assess a design.
No longer confined to brute-force trial and error, the new level of control will allow engineers to optimize designs mathematically for a wide range of future applications. The work is especially important in technologies like solar panels, optical circuits and quantum computers.
Currently, the team’s findings are specific to thermal sources of light, like the sun or like an incandescent bulb. But the researchers hope to generalize the work further to agree with other light sources, like LEDs, fireflies, or arcing bolts of electricity.
The research was supported in part by the National Science Foundation, the Cornell Center for Materials Research, the Defense Advanced Research Projects Agency and the Natural Sciences and Engineering Research Council of Canada.
A new study led by Simon Fraser University’s Dean of Science, Prof. Paul Kench, has discovered new evidence of sea-level variability in the central Indian Ocean.
The study, which provides new details about sea levels in the past, concludes that sea levels in the central Indian Ocean have risen by close to a meter in the last two centuries.
Prof. Kench says, “We know that certain types of fossil corals act as important recorders of past sea levels. By measuring the ages and the depths of these fossil corals, we are identifying that there have been periods several hundred years ago that the sea level has been much lower than we thought in parts of the Indian Ocean.”
He says understanding where sea levels have been historically, and what happens as they rise, will provide greater insights into how coral reef systems and islands may be able to respond to the changes in sea levels in the future.
Underscoring the serious threat posed to coastal cities and communities in the region, the ongoing study, which began in 2017, further suggests that if such acceleration continues over the next century, sea levels in the Indian Ocean will have risen to their highest level ever in recorded history.