The case for open computer programs
Reproducing a scientific result now very often depends on the computational method being available, so here it is argued that all source code should be freely available. Most scientific papers published today rely on computer programs for data collection and manipulation. Writing in the Perspective pages in this issue of Nature, Darrel Ince and colleagues argue that the policies of most journals and funding bodies towards the release of computer code as part of the publication process are obsolete. They say that the full release of actual source code should be the norm for any scientific results dependent on computation, with an agreed list of exceptions applicable only to rare cases. Current policies range from a requirement for release of the relevant computer programs on request to Nature's less stringent stipulation of a 'natural language' description of computer algorithms.
Scientific communication relies on evidence that cannot be entirely included in publications, but the rise of computational science has added a new layer of inaccessibility. Although it is now accepted that data should be made available on request, the current regulations regarding the availability of software are inconsistent. We argue that, with some exceptions, anything less than the release of source programs is intolerable for results that depend on computation. The vagaries of hardware, software and natural language will always ensure that exact reproducibility remains uncertain, but withholding code increases the chances that efforts to reproduce results will fail.
Depth migration of imaged time sections
None of the leading approaches to the migration of seismic sections (the Kirchhoff-summation method, the finite-difference method, or the frequency-domain method) readily migrates seismic reflections to their proper positions when overburden velocities vary laterally. For inhomogeneous media, the diffraction curve for a localized, buried scatterer is no longer hyperbolic and its apex is displaced laterally from the position directly above the scatterer. Hubral observed that the Kirchhoff-summation method images seismic data at emergent “image ray” locations rather than at the desired positions vertically above scatterers. In addition, distortions in diffraction shapes lead to incorrect imaging (i.e., incomplete diffraction collapse) and, hence, to further displacement errors for dipping reflections. The finite-difference method has been believed to continue waves downward correctly through inhomogeneous media. In conventional implementations, however, both the finite-difference method and the frequency-domain approach commit the same error that the Kirchhoff method does. Synthetic examples demonstrate how conventional migration fails to image events completely. Hubral’s solution to this migration problem is two- (or three-) dimensional mapping of imaged time sections into depth. This mapping, “depth migration,” replaces simple vertical conversion from time to depth. Such depth migration can be postponed until after efficient image-ray modeling has been performed to (1) support the final choice of velocity model, and (2) determine whether depth migration is necessary. Comparisons between depth-migrated and conventionally depth-converted sections of both synthetic and field data show that significant lateral displacement is often required to position reflectors properly. Monte Carlo studies show that the lateral corrections can be important not only in absolute terms but also in relation to errors expected from an inaccurate velocity model.
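For orientation, the homogeneous reference case that the abstract contrasts against can be sketched in a few lines. This is only an illustration under stated assumptions (zero-offset recording, constant velocity, flat horizontal layers for the vertical conversion); the function names and parameters are hypothetical and are not taken from the paper, and the sketch does not implement depth migration or image-ray modeling.

```python
import math


def diffraction_time(x, x0, z, v):
    """Two-way zero-offset traveltime recorded at surface position x for a
    point scatterer at lateral position x0 and depth z, assuming a constant
    velocity v. As a function of x this is a hyperbola with apex at x = x0."""
    return 2.0 * math.hypot(x - x0, z) / v


def vertical_depth(two_way_times, interval_velocities):
    """Simple vertical time-to-depth conversion (assumed flat layers).

    two_way_times holds the cumulative two-way time at the base of each layer;
    interval_velocities holds the velocity inside each layer. Each layer
    contributes v * dt / 2 to the depth.
    """
    depth = 0.0
    previous_time = 0.0
    for t, v in zip(two_way_times, interval_velocities):
        depth += v * (t - previous_time) / 2.0
        previous_time = t
    return depth
```

In this constant-velocity case the earliest arrival sits directly above the scatterer. The abstract's point is that this stops being true once velocities vary laterally, which is why purely vertical conversion can misplace reflectors.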
Statistical Estimation of the Residential Baseline
Demand response in the residential market is becoming a way to adapt customer consumption to the available supply and therefore to lower peak electricity prices. Tariff incentives and direct load control of residential air-conditioners and electric heaters are flexible solutions to reduce the peak demand. For operators to include residential demand response resources in their planning, quantifying the demand reduction is becoming a major issue for all electrical stakeholders. Current methods are based on day or weather matching, regressions and control group approaches. In general, methods using available data from a control group give more accurate results. With the introduction of smart meters, the electric utilities generate a large amount of quality data, available almost in real time. In this paper, we suggest using these available residential load curves to select a control group based on individual load curves. One of the advantages of our method is that the selected control group can adapt at any time to the number of individuals belonging to the demand reduction program, as this number evolves with customers entering and leaving the program. Constrained regression methods and an algorithm are developed and evaluated on real data, providing a reliable solution for operational use.
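The idea of picking a control group from individual load curves can be illustrated with a minimal sketch. Note that this swaps in a plain nearest-curve selection (smallest squared distance to the program group's mean pre-event curve) rather than the constrained regression the abstract mentions, whose details are not given here. All function names, argument names and data layouts below are assumptions made for illustration.

```python
from statistics import fmean


def mean_curve(curves):
    """Pointwise mean of equally long load curves."""
    return [fmean(values) for values in zip(*curves)]


def select_control_group(target_curve, candidates, k):
    """Return the ids of the k candidate customers whose pre-event load
    curve is closest (squared Euclidean distance) to target_curve.

    candidates maps a customer id to that customer's pre-event curve.
    """
    def distance(curve):
        return sum((a - b) ** 2 for a, b in zip(curve, target_curve))

    ranked = sorted(candidates, key=lambda cid: distance(candidates[cid]))
    return ranked[:k]


def estimate_reduction(program_pre, program_event, control_pre, control_event, k):
    """Estimate the demand reduction of the program group during an event.

    The baseline is the mean event-period curve of the selected control
    customers; the reduction is baseline minus observed program load.
    """
    target = mean_curve(program_pre)
    chosen = select_control_group(target, control_pre, k)
    baseline = mean_curve([control_event[cid] for cid in chosen])
    observed = mean_curve(program_event)
    reduction = [b - o for b, o in zip(baseline, observed)]
    return chosen, baseline, reduction
```

Because the selection depends only on the current program members' curves, it can be rerun whenever customers enter or leave the program, which is the adaptivity the abstract highlights.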
Protein Structure and Evolution: Are They Constrained Globally by a Principle Derived from Information Theory?
That the physicochemical properties of amino acids constrain the structure, function and evolution of proteins is not in doubt. However, principles derived from information theory may also set bounds on the structure (and thus also the evolution) of proteins. Here we analyze the global properties of the full set of proteins in release 13-11 of the SwissProt database, showing by experimental test of predictions from information theory that their collective structure exhibits properties that are consistent with their being guided by a conservation principle. This principle (Conservation of Information) defines the global properties of systems composed of discrete components, each of which is in turn assembled from discrete smaller pieces. In the system of proteins, each protein is a component, and each protein is assembled from amino acids. Central to this principle is the inter-relationship of the unique amino acid count and total length of a protein and its implications for both average protein length and occurrence of proteins with specific unique amino acid counts. The unique amino acid count is simply the number of distinct amino acids (including those that are post-translationally modified) that occur in a protein, and is independent of the number of times that any particular amino acid occurs in the sequence. Conservation of Information does not operate at the local level (it is independent of the physicochemical properties of the amino acids), where the influences of natural selection are manifest in the variety of protein structure and function that is well understood. Rather, this analysis implies that Conservation of Information would define the global bounds within which the whole system of proteins is constrained; thus it appears to be acting to constrain evolution at a level different from natural selection, a conclusion that appears counter-intuitive but is supported by the studies described herein.
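The unique amino acid count defined above is easy to compute, and a short sketch may make the two quantities concrete. It assumes each protein is given as a string of one-letter residue codes, so post-translationally modified residues (which the abstract counts separately) are not distinguished here. The function names are hypothetical and the grouping step is only an illustration, not the paper's analysis.

```python
from collections import defaultdict


def unique_amino_acid_count(sequence):
    """Number of distinct residue letters in the sequence, regardless of
    how many times each one occurs."""
    return len({residue for residue in sequence.upper() if residue.isalpha()})


def mean_length_by_unique_count(sequences):
    """Group sequences by unique amino acid count and return the mean
    sequence length within each group."""
    totals = defaultdict(lambda: [0, 0])  # count -> [sum of lengths, number of proteins]
    for sequence in sequences:
        k = unique_amino_acid_count(sequence)
        totals[k][0] += len(sequence)
        totals[k][1] += 1
    return {k: length_sum / n for k, (length_sum, n) in totals.items()}
```

For example, `unique_amino_acid_count("MKKLLA")` is 4, since only M, K, L and A occur, however often each is repeated.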
EC—a measurement based safer subset of ISO C suitable for embedded system development
Leslie Hatton | Information and Software Technology | 2004