DOL – Delete Oldest Logs

This is a little program I wrote some time ago because I had trouble with some log files I was generating. I didn’t know how big they would end up, and I had limited disk space. As you can guess, it works on any kind of file, so I could have called it “DOF – Delete Oldest Files”.

The idea of this tiny .NET program is to delete the oldest logs first. It scans every file and deletes as many files as required to reach its target.
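
To make the idea concrete, here is a minimal Python sketch of the "delete oldest first" selection. It is not DOL's actual source (DOL is a .NET program); the names `scan` and `pick_files_to_delete` are made up for this illustration, and only the modification-time sort is shown.

```python
from pathlib import Path
from typing import Iterable, List, Optional, Tuple

# (path, timestamp, size in bytes); hypothetical record used by this sketch
FileInfo = Tuple[str, float, int]


def scan(directory: str, recursive: bool = False) -> List[FileInfo]:
    """Collect regular files with their modification time and size."""
    pattern = "**/*" if recursive else "*"
    result = []
    for path in Path(directory).glob(pattern):
        if path.is_file():
            stat = path.stat()
            result.append((str(path), stat.st_mtime, stat.st_size))
    return result


def pick_files_to_delete(
    files: Iterable[FileInfo],
    max_total_size: Optional[int] = None,
    max_count: Optional[int] = None,
) -> List[str]:
    """Return the oldest paths whose removal satisfies both limits."""
    remaining = sorted(files, key=lambda f: f[1])  # oldest first
    total = sum(size for _, _, size in remaining)
    doomed: List[str] = []
    index = 0
    while index < len(remaining):
        over_size = max_total_size is not None and total > max_total_size
        over_count = max_count is not None and len(remaining) - index > max_count
        if not (over_size or over_count):
            break
        path, _, size = remaining[index]
        doomed.append(path)
        total -= size
        index += 1
    return doomed
```

A caller would feed `scan(...)` into `pick_files_to_delete(...)` and then remove each returned path. Keeping the selection separate from the deletion makes it easy to do a dry run first.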

Command line arguments (-h option) :

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
DOL - Delete Oldest Logs (first)
Arguments are :
===============
-d          : Directory where to analyse files
-dr         : Directory where to analyse files recursively
-de         : Delete empty dirs
-sm         : Sort files by modification (default)
-sc         : Sort files by creation
-sa         : Sort files by access
-ds <size>  : Delete oldest logs to reach a total size of <size> (in bytes)
-dn <nb>    : Delete oldest logs to reach a total number of <nb>
-v          : Switch to verbose mode
 
Numbers can have k,m,g :
1k = 1 024         bytes
1m = 1 048 576     bytes
1g = 1 073 741 824 bytes
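
The k/m/g suffixes are plain powers of 1024. A small Python sketch of how such a value could be parsed is shown here; `parse_size` is a hypothetical helper, and accepting upper-case suffixes is an assumption of this sketch, not something the help text promises.

```python
# Multipliers taken from the help text above
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert '2g', '512k' or a bare number into a byte count."""
    text = text.strip().lower()  # assumption: suffix case does not matter
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)
```

For instance, `parse_size("2g")` gives 2 147 483 648 bytes, which is the limit used in the Windows example.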

So, on Windows, you will probably do something like :

dol.exe -de -dr C:\temp -ds 2g

On Linux, it’s the same command prefixed with “mono”.

mono dol.exe -dr /var/log/apache -ds 1g

You can download it here (13.5 KB).

SharePoint – Clean a huge Document Library

I recently had to clean a huge (113 000 rows) document library. The first question that comes to mind is : Why was it so big ? Well, someone thought it was better to store data in lots of XML files instead of in a classic list.

The second question is : How ?

First, you have to understand that you can’t use a DeleteAll() method; it doesn’t exist. You must fetch the data in small batches of rows. If you try to get everything at once, you will get an OutOfMemoryException.

I can’t give you the final app as I am not sure that I have the right to, but I can give you the core source code :

First of all, you want to avoid the out-of-memory problem. So you split your results into small batches of 100 rows, which is pretty simple :

// We get everything but we limit the result to 100 rows
SPQuery q = new SPQuery();
q.RowLimit = 100;
 
// We get the results
SPListItemCollection coll = list.GetItems( q );

Then, what you could do is delete each item one by one :

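// Iterate backwards: deleting inside a foreach modifies the collection being enumerated
for ( int i = coll.Count - 1; i >= 0; i-- ) {
   coll[i].Delete();
}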

But it’s soooo freaking slow. On a production server, we had something like 1 delete per second.

What you need is a CAML batch delete. This method builds the CAML delete batch :

private static String BuildBatchDeleteCommand( SPList list, SPListItemCollection coll ) {
	StringBuilder sbDelete = new StringBuilder();
	sbDelete.Append( "<?xml version=\"1.0\" encoding=\"UTF-8\"?><Batch>" );
 
	// We prepare a String.Format with a String.Format, this is why we have a {{0}}
	string command = String.Format( "<Method><SetList Scope=\"Request\">{0}</SetList><SetVar Name=\"ID\">{{0}}</SetVar><SetVar Name=\"Cmd\">Delete</SetVar><SetVar Name=\"owsfileref\">{{1}}</SetVar></Method>", list.ID );
	foreach ( SPListItem item in coll ) {
		sbDelete.Append( string.Format( command, item.ID.ToString(), item.File.ServerRelativeUrl ) );
	}
	sbDelete.Append( "</Batch>" );
	return sbDelete.ToString();
}

With that method, we could delete 100 rows in 2 to 3 seconds. Still, you can’t expect magic with SharePoint; everything around SharePoint is slow.
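
If the "String.Format inside a String.Format" trick looks odd, here is the same two-stage templating sketched in Python. It only builds the XML string and does not talk to SharePoint. `build_batch_delete` and its `(item_id, file_ref)` tuples are names invented for this sketch, and the XML escaping is an addition the C# method above does not have.

```python
from typing import Iterable, Tuple
from xml.sax.saxutils import escape


def build_batch_delete(list_id: str, items: Iterable[Tuple[int, str]]) -> str:
    """Build a CAML-style batch with one delete <Method> per item."""
    # Stage 1: fill in the list id once. The doubled braces survive this
    # pass as single braces, ready for the per-item pass below.
    # Assumption: list_id contains no curly braces of its own.
    method_template = (
        '<Method><SetList Scope="Request">{list_id}</SetList>'
        '<SetVar Name="ID">{{item_id}}</SetVar>'
        '<SetVar Name="Cmd">Delete</SetVar>'
        '<SetVar Name="owsfileref">{{file_ref}}</SetVar></Method>'
    ).format(list_id=escape(list_id))

    parts = ['<?xml version="1.0" encoding="UTF-8"?><Batch>']
    for item_id, file_ref in items:
        # Stage 2: fill in the per-item values.
        parts.append(
            method_template.format(item_id=item_id, file_ref=escape(file_ref))
        )
    parts.append("</Batch>")
    return "".join(parts)
```

Preparing the template once keeps the loop body down to a single format call per row, which is the point of the `{{0}}` placeholders in the C# version.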

The final main code should look a little bit like that :

// While there's something left
while ( list.ItemCount > 0 ) {
 
	// We get everything but we limit the result to 100 rows
	SPQuery q = new SPQuery();
	q.RowLimit = 100;
 
	// We get the results
	SPListItemCollection coll = list.GetItems( q );
 
	// We process a CAML batch
	String batch = BuildBatchDeleteCommand( list, coll );
 
	// We execute it
	web.ProcessBatchData( batch );
 
	list.Update();
}

Note that SPWeb.ProcessBatchData returns a String. If your code doesn’t work, it could help you find out why. This is the only way, because a wrong batch won’t throw any exception. Keep that in mind.

SharePoint : What should I Dispose ?

When you begin with SharePoint like I did (and still do), you will ask yourself : “What objects should I dispose ?”. It’s quite important because SharePoint leaks approximately 1 MB per IDisposable object not disposed. If it’s a menu, it could quickly make you lose 10 MB per loaded page.

The best and most complete answer is on MSDN. But it’s a pretty long answer.

The short answer is :
In your webpart, you should dispose every SPWeb and SPSite you use except :

  • SPContext.Current.Site
  • SPContext.Current.Web
  • SPContext.Current.Site.RootWeb

In your features, you should dispose every SPWeb and SPSite you use except the ones given in your “properties” variable.

The reason is that SharePoint gives these objects to every component it launches. If you dispose one of them, the next component loaded by SharePoint has a good chance of crashing. And it’s not always easy to debug, as it’s the NEXT component that crashes although it’s the previous one that messed up one of the SharePoint context variables.

If you’re not sure whether you dispose every object you should, you should check. The following code disposes the SPWeb passed as argument and every parent except the last one, which it returns.

public static SPWeb SPWebParentWebGenerationDp( SPWeb web, int generation ) {
	for ( int i = 0; i < generation; ++i ) {
		SPWeb toDispose = web;
 
		if ( web != null )
			web = web.ParentWeb;
 
		if ( toDispose != null && toDispose != SPContext.Current.Web )
			toDispose.Dispose();
	}
	return web;
}

Automatic error reporting in PHP

I edited this page on 21 March 2010 because a lot of people seem interested and the code has since improved !

PHP has a pretty interesting feature: you can define a callback function to “catch” any error “thrown” in your code. And I’m sure most of you don’t use it. It’s really useful when you want to detect errors before any user reports them (which can take time). This is all about keeping lame errors from wrecking your “user experience”.

I now use it in each of my index.php pages (which generally load every other page), but to speed things up I make it load the actual handler only when an error is “caught”.

This is the code :

function lightErrorHandler($errno, $errstr, $errfile, $errline, $shutdown = false ) {
	// When called from the shutdown function, the relative path doesn't work anymore.
	// You have to load the errorHandler function from its absolute path
	// If you don't like that method, you can always preload this function.
	require_once('/home/website/mysite.com/dev-www/include/error/errorHandler.inc.php');
	return errorHandler($errno, $errstr, $errfile, $errline, $shutdown);
}
set_error_handler( 'lightErrorHandler', E_ALL ^ E_NOTICE);
 
function lightExceptionHandler( $exception ) {
	require_once('./include/error/exceptionHandler.inc.php');
	return exceptionHandler( $exception );
}
set_exception_handler( 'lightExceptionHandler' );
 
function shutdown_function() {
	if(is_null($e = error_get_last()) === false && $e['type'] & (E_ALL ^ E_NOTICE) ) {
		lightErrorHandler( $e['type'], $e['message'], $e['file'], $e['line'], true );
	}
}
register_shutdown_function('shutdown_function');

include/error/errorHandler.inc.php :

function errorHandler($errno, $errstr, $errfile, $errline, $shutdown) {
	global $engine;
 
	$tab = array(
	        'no'    => $errno,
	        'str'   => $errstr,
	        'file'  => $errfile,
	        'line'  => $errline
	);
 
	$message = 'An error happened :'."\n\n".'Error : '."\n".print_r( $tab, true )."\n\n".'StackTrace : '."\n\n".print_r( debug_backtrace(), true )."\r\n".'Memory state : '."\n".print_r( $GLOBALS, true )."\n";
 
	mail(
	        'email@company.com',
	        'MyProject : Error : '.$errfile.':'.$errline,
	        $message
	);
 
	$target = $errfile.':'.$errline;
 
	if ( ! $engine['bug'] && ! $shutdown ) {
		$engine['bug'] = true;
		Logger::log(array(
			'message'		=> 'Error : '.$errstr.' ('.$errno.')',
			'type'			=> 'error/codeError',
			'target'		=> substr( $target, 0-min(250, strlen( $target ))),
			'data'			=> serialize($GLOBALS),
			'level'			=> Logger::CRITICAL
		));
	}
 
	return false;
}

include/error/exceptionHandler.inc.php :

function exceptionHandler( $exception ) {
	global $engine;
 
 
	$_SESSION['lastException'] = $exception;
 
	$message = 'An error happened :'."\n\n".'Error : '."\n".$exception."\n\n".'StackTrace : '."\n".$exception->getTraceAsString()."\r\n".'Miscellaneous information : '."\n".print_r( $GLOBALS, true )."\n";
 
	mail(
	        'email@company.com',
	        'MyProject : Error : '.$exception->getFile().':'.$exception->getLine(),
	        $message
	);
 
	$target = $exception->getFile().':'.$exception->getLine();
 
	if ( ! $engine['bug'] ) {
		$engine['bug'] = true;
		Logger::log(array(
			'message'		=> 'An exception was thrown',
			'type'			=> 'error/exception',
			'target'		=> substr( $target, 0-min(250, strlen( $target ))),
			'data'			=> serialize($GLOBALS),
			'level'			=> Logger::CRITICAL
		));
	}
 
	return false;
}

debug_backtrace requires PHP 4.3, and set_error_handler only accepts the error types argument since PHP 5.0. So, if you plan on using this on a PHP 4.x host, the handler will be called for every error type, and you have to make sure your code doesn’t raise E_NOTICE errors. My code is never E_NOTICE-safe.
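
In case the `E_ALL ^ E_NOTICE` mask is unclear, here is the same bit arithmetic sketched in Python. The constants mirror PHP's values (`E_ALL` is 32767 on PHP 5.4 and later; older versions use a smaller value), and `should_report` is a name made up for this sketch.

```python
# Values of the PHP error constants used above
E_ERROR = 1
E_WARNING = 2
E_NOTICE = 8
E_ALL = 32767  # PHP 5.4 and later

# XOR clears the E_NOTICE bit from the "everything" mask
MASK = E_ALL ^ E_NOTICE


def should_report(error_type: int) -> bool:
    """True when the error type has a bit in common with the mask."""
    return bool(error_type & MASK)
```

This is the same test the shutdown function performs with `$e['type'] & (E_ALL ^ E_NOTICE)`: notices are skipped, everything else is reported.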

Insert SVN version and Build number in your C# AssemblyInfo file

The software version number is quite important. It helps you track which versions your users have when they report something. And when it’s linked to an SVN revision number, it’s even better.

Well, with MSBuild Community Tasks, you can easily generate smart version numbers automatically. You have to:

  • Download MSBuildCommunityTasks
  • Make sure your “svn.exe” binary is in C:\program files\subversion\bin
  • Add this at the end of your .csproj file :

2011-07-02 update: As given in Markus comment, this code is a much better option:

<!-- Import of the MSBuildCommunityTask targets -->
<Import Project="$(MSBuildExtensionsPath)\MSBuildCommunityTasks\MSBuild.Community.Tasks.Targets" />
 
<!-- Update AssemblyInfo to include the svn revision number -->
<Target Name="BeforeBuild">
	<SvnVersion LocalPath="$(MSBuildProjectDirectory)" ToolPath="$(ProgramFiles)\VisualSVN\bin">
		<Output TaskParameter="Revision" PropertyName="Revision" />
	</SvnVersion>
 
	<FileUpdate Files="Properties\AssemblyInfo.cs"
	            Regex="(\d+)\.(\d+)\.(\d+)\.(\d+)"
	            ReplacementText="$1.$2.$3.$(Revision)" />
</Target>

After it, only the closing “</Project>” tag should be left…
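
What the FileUpdate step does to AssemblyInfo.cs is a plain regex substitution. Here is a small Python sketch of the same replacement, handy if you want to check the pattern before wiring it into the build. `stamp_revision` is a hypothetical helper, not part of MSBuild Community Tasks.

```python
import re

# Same pattern as the Regex attribute above: four dot-separated numbers
_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)\.(\d+)")


def stamp_revision(source: str, revision: int) -> str:
    """Replace the fourth version component with the given revision."""
    return _VERSION.sub(rf"\g<1>.\g<2>.\g<3>.{revision}", source)
```

Note that every four-part number in the file gets rewritten, so both AssemblyVersion and AssemblyFileVersion end up carrying the revision.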

Then, you just have to open your project and build it. It will fail once (missing version.txt file) and then work forever. This will generate your Assembly & AssemblyFile versions like this: Major.Minor.SvnVersion.BuildVersion

In your C# code, to get your version, you just have to add something like this:

public static String Version {
  get {
    return System.Reflection.Assembly.GetExecutingAssembly().GetName().Version.ToString();
  }
}