Download Reference Manual
The Developer's Library for D
About Wiki Forums Source Search Contact

Ticket #1702 (new enhancement)

Opened 15 years ago

Last modified 14 years ago

Patch for exposing gc_stats() to the public + GCStats enhancements

Reported by: winterwar Assigned to: community
Priority: trivial Milestone: 1.0
Component: Tango Version: trunk
Keywords: patch Cc:

Description

The function gc_stats() in gc.d from the Tango runtime is not exposed to the user. This patch exposes this function as GC.stats() in tango.core.Memory. The GCStats struct is made public, too.

Additionally, the GCStats struct is extended by 3 additional members: gc_count tells the user how many GC cycles happened so far, mark_time how long the GC spent in the mark phase, and sweep_time how long it spent in the sweep phase. The changes in gxc.d are to fill those new members.

(Both things are kind of unrelated, but I didn't split the patch/tickets because they are so trivial.)

Patch against Tango svn follows:

Index: lib/common/tango/core/Memory.d =================================================================== --- lib/common/tango/core/Memory.d (Revision 4762) +++ lib/common/tango/core/Memory.d (Arbeitskopie) @@ -47,6 +47,20 @@

extern (C) void gc_removeRoot( void* p ); extern (C) void gc_removeRange( void* p );

+ + struct GCStats + { + size_t poolsize; // total size of pool + size_t usedsize; // bytes allocated + size_t freeblocks; // number of blocks marked FREE + size_t freelistsize; // total of memory on free lists + size_t pageblocks; // number of blocks marked PAGE + size_t gc_count; // number of GC runs + ulong mark_time; // microseconds spend in mark-phase + ulong sweep_time; // microseconds spend in sweep-phase + } + + extern (C) GCStats gc_stats();

}

@@ -502,4 +516,11 @@

newcap = newext > newcap ? newext : newcap; // just to handle overflows return newcap;

}

+ + alias .GCStats GCStats; + + static GCStats stats() { + return gc_stats(); + }

}

+ Index: lib/gc/basic/rt/basicgc/gcx.d =================================================================== --- lib/gc/basic/rt/basicgc/gcx.d (Revision 4762) +++ lib/gc/basic/rt/basicgc/gcx.d (Arbeitskopie) @@ -55,6 +55,7 @@

debug(THREADINVARIANT) private import tango.stdc.posix.pthread; debug(PRINTF) private import tango.stdc.posix.pthread : pthread_self, pthread_t; debug private import tango.stdc.stdio : printf;

+private import tango.time.StopWatch? : StopWatch?;

version (GNU) {

@@ -1337,6 +1338,10 @@

stats.poolsize = psize; stats.usedsize = bsize - flsize; stats.freelistsize = flsize;

+ + stats.gc_count = gcx.stats_gc_count; + stats.mark_time = gcx.stats_mark_time; + stats.sweep_time = gcx.stats_sweep_time;

}

}

@@ -1434,7 +1439,12 @@

List *bucket[B_MAX]; // free list for each size

+ // see rt.basicgc.gcstats.GCStats + size_t stats_gc_count; + ulong stats_mark_time; + ulong stats_sweep_time; +

void initialize() { int dummy;

@@ -2262,6 +2272,11 @@

debug(COLLECT_PRINTF) printf("Gcx.fullcollect()\n");

+ stats_gc_count++; + + StopWatch? timer; + timer.start(); +

thread_suspendAll();

p_cache = null;

@@ -2394,6 +2409,9 @@

thread_resumeAll();

+ stats_mark_time += timer.microsec; + timer.start(); +

// Free up everything not marked debug(COLLECT_PRINTF) printf("\tfree'ing\n"); size_t freedpages = 0;

@@ -2551,6 +2569,8 @@

}

}

+ stats_sweep_time += timer.microsec; +

debug(COLLECT_PRINTF) printf("recovered pages = %d\n", recoveredpages); debug(COLLECT_PRINTF) printf("\tfree'd %u bytes, %u pages from %u pools\n", freed, freedpages, npools);

Index: lib/gc/basic/rt/basicgc/gcstats.d =================================================================== --- lib/gc/basic/rt/basicgc/gcstats.d (Revision 4762) +++ lib/gc/basic/rt/basicgc/gcstats.d (Arbeitskopie) @@ -35,4 +35,7 @@

size_t freeblocks; // number of blocks marked FREE size_t freelistsize; // total of memory on free lists size_t pageblocks; // number of blocks marked PAGE

+ size_t gc_count; // number of GC runs + ulong mark_time; // microseconds spent in mark-phase + ulong sweep_time; // microseconds spent in sweep-phase

}

Attachments

patch (3.5 kB) - added by winterwar on 07/03/09 20:55:04.
the patch (attachments magically work now, sorry about the inline patch)
pa (7.6 kB) - added by winterwar on 07/12/09 15:00:12.
new patch
Memory.patch (0.9 kB) - added by doob on 05/24/10 22:20:24.
Patch for the latest issue
gc_monitor.patch (1.1 kB) - added by mwarning on 05/27/10 15:11:42.

Change History

07/03/09 20:55:04 changed by winterwar

  • attachment patch added.

the patch (attachments magically work now, sorry about the inline patch)

07/03/09 22:47:41 changed by kris

  • owner changed from kris to fawzi.

Thanks. What do you think, Fawzi?

07/04/09 16:56:05 changed by fawzi

I don't agree with the choice of making the gc_stat struct public, the kind of info available is quite generic, it is still dependent on the gc algorithm, and the access to it is not time critical, so I would like to see an interface with a string based interface (hash like), and ways to check the presence of a given string.

Total allocation/deallocation and total time in gc could be generic for all GCs.

With those changes I have no problems with the patch

07/05/09 23:54:42 changed by winterwar

Fawzi, what exactly do you have in mind?

Here are two ideas about an interface, that isn't fixed to a struct with fixed members:

1. ulong gc_get_stat(char[] stat_name);

(can return 0 or ulong.max to signal non-presence of a stat_name)

2. ulong[char[]] gc_get_stats();

Same as 1., but return all present values in an associative array.

3. struct StatEntry { char[20] name; ulong value; } StatEntry[] gc_get_stats(StatEntry[] prealloc);

Basically the same interface again. The user has to pass a preallocated array in prealloc, and the function returns a slice of this array.

But that isn't without problems:

For 1, you need several calls to gc_get_stat() to get all statistics, but then the returned values could be inconsistent with each other (e.g. because a garbage collection happened between two calls). I don't think this is really a problem, though.

For 2, gc_get_stats() needs to allocate memory, which might be a problem.

3 avoids memory allocation, but is a bit complicated and unintuitive for the user.

Please tell me what to do. I think if I don't push this, nothing will ever happen. It's so trivial, yet annoying.

07/06/09 08:40:48 changed by fawzi

I was thinking to use a class that returns the gc info, and implements real opIndex(char[]) and bool opIn(char[]) and dup. There should be a global instance that gives the info for the current gc. i.e something like

class GCInfo{
 static GCInfo instance;
 static GCInfo opCall(){ return instance; }
 real opIndex(char[]k){ assert(false,"unsupported key "~k); }
 bool opIn(char[]k){ return false; }
 GCInfo dup(){ return this; }
}

private class BasicGCInfo: GCInfo{
//...
}

static this(){
  GCInfo.instance=new BasicGCInfo();
}

doing it this way, support for other GC would be trivial without changing the interface

About the extra info on gc runs that is part of the patch, that is welcome.

07/12/09 15:00:12 changed by winterwar

  • attachment pa added.

new patch

08/04/09 10:25:51 changed by fawzi

Sorry I just looked in detail at the patch, and I don't like having tango-base depending on Variant (through the extern(C) function).

Sometime Variant does not compile, or has problems on some platform/compilers, making tango-base depend on it looks like a bad choice. I am sorry I didn't realize the issue earlier, I think that maybe using a real is still the best solution...

08/25/09 18:44:19 changed by fawzi

in r4880 GCStats are user accessible

11/29/09 12:59:51 changed by fawzi

  • owner changed from fawzi to kris.

this will have to be done again in some way with the new directory struct

01/12/10 07:48:47 changed by kris

  • milestone changed from 0.99.9 to 1.0.

05/22/10 19:17:42 changed by kris

  • status changed from new to closed.
  • resolution set to fixed.

(In [5465]) fixes #1702 :: Patch for exposing gc_stats() to the public + GCStats enhancements

made some changes to support the invocation of user-defined delegates at the beginning and end of a full-collection

Thanks to wintervar

05/24/10 13:28:10 changed by doob

The changset [5465] caused some problems when building tango as a dynamic library on mac. I'm getting errors about a missing symbol.

Undefined symbols:

"_D5tango4core6Memory10gc_monitorFDFZvDFiiZvZv", referenced from:

_D5tango4core6Memory2GC7monitorFDFZvDFiiZvZv in tango-coreMemory-release.o

ld: symbol(s) not found collect2: ld returned 1 exit status

The problem is the newly added "gc_monitor" in tango.core.Memory. Since it's declared extern (D) it will include the module in the mangled name. But "gc_monitor" is declared in rt.gc.basic.gc and that declaration will mangle the name to containing a different module name. Is it possible to declare the function as a C function?

(follow-up: ↓ 12 ) 05/24/10 19:00:26 changed by kris

  • status changed from closed to reopened.
  • resolution deleted.

oh drat ... I'd like to make it extern(C), but doing so won't accept delegates as arguments :|

Any idea how to fix the delegate issue? (change the decl to extern(C) and recompile to see the issue)

(in reply to: ↑ 11 ) 05/24/10 20:04:25 changed by doob

Replying to kris:

oh drat ... I'd like to make it extern(C), but doing so won't accept delegates as arguments :|

Yeah, I noticed that.

Any idea how to fix the delegate issue? (change the decl to extern(C) and recompile to see the issue)

An ugly solution is to declare the function as an extern (C) void pointer with the same name as the mangled name then cast it to a function pointer when calling it:

extern (C) void* _D2rt2gc5basic2gc10gc_monitorFDFZvDFiiZvZv;

static void monitor( void delegate() begin, void delegate(int, int) end )
{
    (cast(void function (void delegate(), void delegate(int, int)) gc_monitor)( begin, end );
}

Actually, I don't know if the above will run, it links anyway.

05/24/10 22:20:24 changed by doob

  • attachment Memory.patch added.

Patch for the latest issue

05/27/10 12:25:36 changed by Marenz

What compiles, too is

_gc.monitor (cast(void delegate())begin,cast(void delegate(int,int)) end);

but when trying to use the resulting .a to link I still get

undefined reference to `_D5tango4core6Memory10gc_monitorFDFZvDFiiZvZv'

--Maren

05/27/10 14:21:22 changed by Marenz

Okay, I fixed it now.

in Memory.d, change the decleration, so that it looks like

		alias extern(D) void delegate() ddel;
		alias extern(D) void delegate(int,int) dint;

		extern (C) void gc_monitor(ddel begin, dint end );

in gc.d:

alias extern(D) void delegate() ddel;
alias extern(D) void delegate(int,int) dint;

extern (C) void gc_monitor( ddel begin, dint end )
{
	_gc.monitor (begin,end);
}

(or you can define the aliases only in one file and import the file, doesn't matter)

--Marenz

05/27/10 15:10:00 changed by mwarning

Works me as well (dmd 1.060, tango trunk).

05/27/10 15:11:42 changed by mwarning

  • attachment gc_monitor.patch added.

05/30/10 17:53:25 changed by kris

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [5477]) fixes #1702 :: Patch for exposing gc_stats() to the public + GCStats enhancements

kudos to doob & marenz

06/21/10 11:11:23 changed by winterwar

  • status changed from closed to reopened.
  • resolution deleted.

Sorry, that's not enough information you can get from the GC.

What is still missing: * get the number of bytes in use * get the number of free bytes (e.g. in freelists) * get the number of pools (this number seems kind of critical for GC performance, so you'd naturally be interested in it)

The solution is very simple: just expose gcstats.d and the gc_stats() function to the user.

07/17/10 18:10:23 changed by kris

  • status changed from reopened to new.
  • owner changed from kris to community.