#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#

#
# Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#


Why 32-bit libelf is not Large File Aware
-----------------------------------------

The 32-bit ELF format uses unsigned 32-bit integers for offsets, so
the theoretical limit on a 32-bit ELF object is 4GB. However, the
32-bit version of libelf imposes a 2GB limit on the objects it can
create. The Solaris link-editor and related tools are all based on
libelf, so the 32-bit version of the link-editor also has a 2GB
limit, despite the theoretical limit of 4GB.

Large file support (LFS) is a half step between the 32-bit and 64-bit
worlds, in which an otherwise 32-bit limited process is allowed to
read and write data to a file that can be larger than 2GB (the extent
of a signed 32-bit integer, as represented by the system type off_t).
LFS is useful if the program only needs to access a small subset of
the file data at any given time (e.g. /usr/bin/cat). It is less useful
if the program needs to access a large amount of data at once: having
been freed from the file size limit, the program will simply hit the
4GB virtual address space limit instead.
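The 2GB and 4GB figures follow directly from the widths of the types
involved. The following is a minimal sketch, not libelf code: it uses
the C99 fixed-width types int32_t and uint32_t as stand-ins for a
32-bit off_t and a 32-bit ELF offset, and all function names are
illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* n gigabytes expressed in bytes (1GB = 2^30 bytes). */
uint64_t
gb(unsigned n)
{
	return ((uint64_t)n << 30);
}

/* Largest value a signed 32-bit off_t can hold: 2^31 - 1. */
uint64_t
max_signed32_offset(void)
{
	return ((uint64_t)INT32_MAX);
}

/* Largest value an unsigned 32-bit ELF offset can hold: 2^32 - 1. */
uint64_t
max_unsigned32_offset(void)
{
	return ((uint64_t)UINT32_MAX);
}

/* The two limits sit one byte below 2GB and 4GB respectively. */
void
check_limits(void)
{
	assert(max_signed32_offset() + 1 == gb(2));
	assert(max_unsigned32_offset() + 1 == gb(4));
}
```

Calling check_limits() from a test program confirms both boundaries
on any platform that provides <stdint.h>.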

In particular, the link-editor generally requires twice as much
memory as the size of the output object: half to hold the input
objects, and half to hold the result. This means that a 32-bit
link-editor process will hit the 2GB file size limit and the 4GB
address space limit at roughly the same time. As a result, a
large file aware 32-bit version of libelf has no significant value.
Despite this, the question of what it would take to make libelf
large file aware comes up from time to time.
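The arithmetic behind this argument can be written down as a rough
sketch. It assumes the "twice the output size" rule of thumb stated
above and treats the full 4GB as the available address space, which
is optimistic, since the process also needs room for its own text,
stack and heap. The function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define	GB	((uint64_t)1 << 30)

/* Rule of thumb: input objects plus result, about 2x the output. */
uint64_t
estimated_link_memory(uint64_t output_size)
{
	assert(output_size <= UINT64_MAX / 2);	/* avoid overflow */
	return (2 * output_size);
}

/* Nonzero if the estimate is below a 32-bit (4GB) address space. */
int
fits_in_32bit_address_space(uint64_t output_size)
{
	return (estimated_link_memory(output_size) < 4 * GB);
}
```

Under these assumptions a 1GB output object fits, while a 2GB output
object already needs the entire 4GB address space, which is why
lifting the 2GB file limit alone buys very little.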

The first step would be to provide alternative versions of
all public data structures that involve the off_t data type.
These structs, found in /usr/include/libelf.h, are:

	/*
	 * Archive member header
	 */
	typedef struct {
		char		*ar_name;
		time_t		ar_date;
		uid_t		ar_uid;
		gid_t		ar_gid;
		mode_t		ar_mode;
		off_t		ar_size;
		char		*ar_rawname;
	} Elf_Arhdr;


	/*
	 * Data descriptor
	 */
	typedef struct {
		Elf_Void	*d_buf;
		Elf_Type	d_type;
		size_t		d_size;
		off_t		d_off;		/* offset into section */
		size_t		d_align;	/* alignment in section */
		unsigned	d_version;	/* elf version */
	} Elf_Data;

As off_t is a signed 32-bit type in a 32-bit process, it cannot
represent values of 2GB or more, so these alternative versions would
have to use the 64-bit off64_t type instead.
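As a hedged sketch of what such alternative structures might look
like: the names Elf_Arhdr64, Elf_Data64, lfs_off64_t and
arhdr64_exceeds_off32() below are hypothetical and do not exist in
libelf. To keep the example portable, int64_t stands in for off64_t,
void for Elf_Void, and int for Elf_Type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <time.h>

typedef int64_t	lfs_off64_t;	/* stand-in for off64_t */

/* Hypothetical large file aware archive member header. */
typedef struct {
	char		*ar_name;
	time_t		ar_date;
	uid_t		ar_uid;
	gid_t		ar_gid;
	mode_t		ar_mode;
	lfs_off64_t	ar_size;	/* off_t in Elf_Arhdr */
	char		*ar_rawname;
} Elf_Arhdr64;

/* Hypothetical large file aware data descriptor. */
typedef struct {
	void		*d_buf;		/* Elf_Void * in libelf.h */
	int		d_type;		/* Elf_Type in libelf.h */
	size_t		d_size;
	lfs_off64_t	d_off;		/* off_t in Elf_Data */
	size_t		d_align;
	unsigned	d_version;
} Elf_Data64;

/* Nonzero if the member size would not fit a signed 32-bit off_t. */
int
arhdr64_exceeds_off32(const Elf_Arhdr64 *hdr)
{
	assert(hdr != NULL);
	return (hdr->ar_size > INT32_MAX);
}
```

Only the off_t fields change. Every other field keeps its existing
type, which is what makes the two layouts incompatible and forces
both variants to coexist.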

In addition to providing alternative large file aware Elf_Arhdr and
Elf_Data types, it would be necessary to implement large file aware
versions of the public functions that use them, also found in
/usr/include/libelf.h:

	/*
	 * Function declarations
	 */
	unsigned  elf_flagdata(Elf_Data *, Elf_Cmd, unsigned);
	Elf_Arhdr *elf_getarhdr(Elf *);
	off_t	  elf_getbase(Elf *);
	Elf_Data  *elf_getdata(Elf_Scn *, Elf_Data *);
	Elf_Data  *elf_newdata(Elf_Scn *);
	Elf_Data  *elf_rawdata(Elf_Scn *, Elf_Data *);
	off_t	  elf_update(Elf *, Elf_Cmd);
	Elf_Data  *elf32_xlatetof(Elf_Data *, const Elf_Data *, unsigned);
	Elf_Data  *elf32_xlatetom(Elf_Data *, const Elf_Data *, unsigned);
	Elf_Data  *elf64_xlatetof(Elf_Data *, const Elf_Data *, unsigned);
	Elf_Data  *elf64_xlatetom(Elf_Data *, const Elf_Data *, unsigned);

It is important to note that these new versions cannot replace the
original definitions. Those must continue to be available to support
non-large-file-aware programs. These new types and functions would be in
addition to the pre-existing versions.

Making code like this large file aware requires a careful analysis
to ensure that all the surrounding code uses variable types large
enough to handle the increased range. Hence, this work is more
complicated than simply supplying variants that use a bigger off_t
and rebuilding. That is just the first step.
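The kind of problem such an analysis looks for is a 64-bit offset
silently stored in a 32-bit variable somewhere in the surrounding
code. The following sketch shows one defensive pattern, a range check
before narrowing. The function names are illustrative and are not
taken from libelf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero if off can be stored in a signed 32-bit variable. */
int
offset_fits_in_32(int64_t off)
{
	return (off >= INT32_MIN && off <= INT32_MAX);
}

/*
 * Narrow only after checking the range.  Returns 0 and stores the
 * value on success; returns -1 and leaves *out untouched otherwise.
 */
int
narrow_offset(int64_t off, int32_t *out)
{
	assert(out != NULL);
	if (!offset_fits_in_32(off))
		return (-1);
	*out = (int32_t)off;
	return (0);
}
```

A plain assignment such as "int n = off;" compiles without complaint
in many build configurations, so each such site has to be found by
inspection or by enabling the compiler's conversion warnings.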

There are two standard preprocessor definitions used to control
large file support:

	_LARGEFILE64_SOURCE
	_FILE_OFFSET_BITS

These preprocessor definitions would be used to determine whether
a given program linked against libelf would see the regular or
the large file aware versions of the above types and routines.
This is the same approach used in other large file capable software,
such as libc.
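A minimal sketch of how a header might select between the two
variants is shown below. Everything prefixed with demo_ is
hypothetical; libelf provides no such interfaces. In this sketch the
explicit 32 and 64 names are always visible, which loosely mirrors
what _LARGEFILE64_SOURCE exposes in libc, while _FILE_OFFSET_BITS
decides what the unsuffixed name means.

```c
/* Normally supplied by the build, e.g. -D_FILE_OFFSET_BITS=64. */
#ifndef _FILE_OFFSET_BITS
#define	_FILE_OFFSET_BITS	64
#endif

#include <assert.h>
#include <stdint.h>

/* Stand-ins for the two variants of one routine (stub bodies). */
int32_t
demo_getbase32(void)
{
	return (INT32_MAX);
}

int64_t
demo_getbase64(void)
{
	return ((int64_t)INT32_MAX + 1);
}

/* Header-style selection of the unsuffixed name and offset type. */
#if _FILE_OFFSET_BITS == 64
#define	demo_getbase	demo_getbase64
typedef int64_t	demo_off_t;
#else
#define	demo_getbase	demo_getbase32
typedef int32_t	demo_off_t;
#endif

/* The selected offset type matches the requested width. */
void
check_selected_variant(void)
{
	assert(sizeof (demo_off_t) * 8 == _FILE_OFFSET_BITS);
}
```

With the 64-bit setting, a call to demo_getbase() resolves to
demo_getbase64() at compile time, so existing callers pick up the
wide variant without source changes, while programs built without
the definition continue to see the original 32-bit interface.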

Finally, all the applications that rely on libelf would need to be made
large file aware. As with libelf itself, there is more to such an effort
than recompiling with preprocessor macros set. The code in these
applications would need to be examined carefully. Some of these programs
are very old, and were not originally written with such type portability
in mind. Such code can be difficult to transition.

To work around the 2GB limit in 32-bit libelf:

    - The fundamental limits of a 32-bit address space mean
      that a program this large should be 64-bit. Only a 64-bit
      address space has enough room for that much code, plus the
      stack and heap needed to do useful work with it.

    - The 64-bit version of libelf is also able to process
      32-bit objects, and does not have a 2GB file size limit.
      Therefore, the 64-bit link-editor can be used to build a 32-bit
      executable that is larger than 2GB. The resulting program will
      consume over half the available address space just to start
      running. However, there may be enough address space left for it
      to do useful work.

      Note that the limit for 32-bit sharable objects remains at
      2GB. This limit is imposed by the runtime linker, which is also
      not large file aware.