Download Reference Manual
The Developer's Library for D
About Wiki Forums Source Search Contact

Ticket #1071 (closed defect: fixed)

Opened 4 months ago

Last modified 1 month ago

stdc.posix.sys.stat.stat_t is wrong on Linux x86_64

Reported by: e-t172 Assigned to: sean
Priority: critical Milestone: 0.99.8
Component: Tango Version: trunk
Keywords: triage Cc:

Description

When using the stat_t structure on a x86_64 Linux system (GDC r199 on GCC 4.1.2), I noticed all the members were messed up : gid in the uid member, uid in the nlink member, etc.

According to the glibc headers (<bits/stat.h>), "struct stat" has a different shape depending on the word size. tango.stdc.posix.sys.stat does not reflect this.

This kind of problem might also occur in other parts of tango.stdc.

Attachments

tango.stat.patch (5.4 kB) - added by e-t172 on 04/27/08 11:42:58.
Fix to the stat_t structure
tango.stat.v2.patch (5.2 kB) - added by e-t172 on 05/04/08 04:03:21.
Slightly better patch
tango.stat.v3.patch (5.1 kB) - added by e-t172 on 05/18/08 11:16:18.
Third try

Change History

04/27/08 11:42:58 changed by e-t172

  • attachment tango.stat.patch added.

Fix to the stat_t structure

04/27/08 11:45:24 changed by e-t172

Be aware that the attached patch has not been tested on 32-bit. One should check for possible regressions on this platform.

04/27/08 12:14:03 changed by larsivi

  • milestone set to 0.99.7.

We have known of the problem for a while, just not had the resources to find a proper solution. Thanks!

04/27/08 12:16:20 changed by larsivi

  • owner changed from kris to sean.
  • priority changed from major to critical.

Will leave to Sean to check the patch.

04/27/08 12:16:53 changed by larsivi

#720 was marked as a duplicate of this.

05/03/08 20:39:22 changed by sean

Is this true of both the default and the USE_FILE_OFFSET64, USE_LARGEFILE64 blocks? It's true that the Tango linux header modules don't match the linux headers exactly, but that's intentional. Please note that I would expect the two above flags to be true on a 64-bit linux system, or any 32-bit system which has a 64-bit filesystem installed.

Regarding this patch... I noticed that it seems to misunderstand the use of c_long, for example, as well as the basic idea behind the current linux header module design. I'd love to have a patch that sorts everything out with this in mind, but as it is I'm afraid I can't apply the patch in this ticket. However, I will go through these files again and see if I can spot any mistakes based on the linux headers I have available.

05/04/08 03:49:10 changed by e-t172

USE_FILE_OFFSET64 and USE_LARGEFILE64 are not necessarily true on 64-bit systems. And even if you enable them in posix.config, it does not work either. In bits/stat.h parts of the stat structure depends on the wordsize, and I modified the Tango files to reflect this. Theoretically, the patched file should work with all combinations of word size, USE_FILE_OFFSET64, and USE_LARGEFILE64.

As of c_long: I closely replicated the type sizes from bits/typesizes.h, and it appears I didn't have to use c_long, because all the type sizes were independent of the word size (except for nlink_t and ssize_t, my mistake). I differentiated off_t and off64_t (for instance) because it was necessary: there are cases where off_t and off64_t are used together in the stat structure.

05/04/08 04:03:21 changed by e-t172

  • attachment tango.stat.v2.patch added.

Slightly better patch

05/11/08 17:35:42 changed by larsivi

  • keywords set to triage.

05/17/08 18:04:29 changed by icy

I applied the patch and ran a quick test:

FilePath?("test.d").fileSize returns 334 which also ls returns, so it seems to work. 2.6.24-16-generic on amd64 with ext3 fs.

More extensive testing is needed but so far it looks really good.

05/18/08 10:01:15 changed by sean

Thanks, this looks better. I haven't gone through the whole thing yet, but it seems like there may still be a few mistakes however. For example, suseconds_t is defined as a "long int" in my version of linux, which would make it c_long. Why the change? Also, why add fsblkcnt64_t if it's the same type as fsblkcnt_t? Finally, have you checked these changes with stat64() and related routines? I was going to add aliases for those as a part of this change as well. ie.

  static if( __USE_LARGEFILE64 )
  {
    dirent* readdir64(DIR*);
    alias   readdir64 readdir;
  }
  else
  {
    dirent* readdir(DIR*);
  }

And so on. The original intent was to avoid the creation of xxx64_t aliases entirely and use USE_LARGEFILE64 to switch between 32 and 64-bit file support. From what I can tell, while USE_FILE_OFFSET64 is an independent value it must always be set the same as USE_LARGEFILE64 for any given system--it doesn't make sense to support 64-bit files but not 64-bit file offsets.

05/18/08 11:10:22 changed by e-t172

suseconds_t is defined as a "long int" in my version of linux, which would make it c_long. Why the change?

My mistake. I must have misread bits/types.h.

why add fsblkcnt64_t if it's the same type as fsblkcnt_t?

Again, my mistake. fsblkcnt_t should be c_ulong.

Finally, have you checked these changes with stat64() and related routines?

I didn't test stat64. It should work. Theoretically.

And so on. The original intent was to avoid the creation of xxx64_t aliases entirely and use USE_LARGEFILE64 to switch between 32 and 64-bit file support.

Here is an example: if USE_LARGEFILE64 is not set and USE_FILE_OFFSET64 is set, the stat structure contains a ino_t field and a ino64_t field. That's why we need both xxx_t and xxx64_t. Unless, of course, you find a more elegant solution which preserves the alignment and field sizes.

For suseconds_t and fsblkcnt_t, I'll upload a new patch right away.

05/18/08 11:16:18 changed by e-t172

  • attachment tango.stat.v3.patch added.

Third try

05/18/08 19:02:35 changed by sean

  • status changed from new to assigned.

Fair enough. I'll roll in the changes this week.

05/19/08 04:03:28 changed by e-t172

One should check the patched structure on 32-bit first, just to make sure there isn't any regressions.

06/15/08 15:24:33 changed by sean

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [3619]) Applied v3 patch from #1071. I will make changes to implicitly call the foo64 version of the functions separately. This closes #1071

06/15/08 15:36:40 changed by kris

Does this enable big files on a 32bit O/S also, sean?

06/15/08 18:33:52 changed by larsivi

The latter question is probably more about configuration than anything else. If the O/S supports it, a define will be set in the system's C compiler which will trigger the support in the relevant headers. Of course, cross compiling is also possible, in which case it is up to the cross-compiler to set this. What does this mean? The D compiler should preferably know the same thing, something that probably is possible with GDC, but not really DMD, and set a corresponding D version identifier. Since this sounds a tad unlikely, the only safe solution is to compile a C program and see if the define is set, and if so enable the USE_LARGEFILE64 version in the D compile line.

We can probably provide the relevant files to help with this for those who needs to make it part of their build process. The other way forward is to make compilers (and/or their configurations) easy to customize for this. For DMD, we could in our install scripts compile the C program, check if the define is set, and if it is, add the version to dmd.conf. I think it should be possible to do the same with the GDC spec file and so on. It would be swell if GDC and LLVMDC for instance could do this on their own based on the C compiler's spec file or something.

So, all-in-all, a problem that mostly needs to be solved outside of our D source files (although the correct versioning will need to be applied), but it should be possible to do so, although in some cases probably manually.

Just to make this more difficult, foo64 is not standard POSIX, and so various O/S may have different ways of providing access to foo64, some systems seems to really forward this to foo(2) and so on. It would be swell if this can be collected somewhere that makes it easy to mantain and expand (for more O/S'es)

06/15/08 21:11:42 changed by sean

Almost, Kris. Linux has separate functions for 64-bit operations, stat64(), etc, each which accept the 64-bit version of the struct. Then, depending on your configuration, they use some #define magic to have calls to stat() actually call stat64(). What I still need to do is go through the posix headers and do something similar with conditional aliasing, so 64-bit file support works automatically on linux platforms that support it. This last commit was about getting the support structs right. Apparently, there were some mistakes in how some of those structs were defined.

06/17/08 02:55:24 changed by e-t172

  • status changed from closed to reopened.
  • resolution deleted.

My patch introduces regressions. We need to fix them. See #1149.

06/17/08 03:11:55 changed by larsivi

Yes, this should have been reopened yesterday. FWIW, the patch is rolled back for now.

07/10/08 07:04:47 changed by larsivi

  • milestone changed from 0.99.7 to 0.99.8.

Any news?

07/25/08 14:21:03 changed by sean

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [3798]) Verified 64-bit data definitions for linux and added 64-bit function prototypes for linux. These may enabled by setting "USE_LARGEFILE64 = true" in tango.stdc.posix.config. Untested on 64-bit linux, so all feedback welcome. This closes #656 and closes #1071