<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-28784075</id><updated>2012-02-09T05:14:50.855-08:00</updated><title type='text'>Postcards from the Fridge</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>12</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-28784075.post-115620952296134030</id><published>2006-08-21T18:16:00.000-07:00</published><updated>2006-08-21T18:59:54.850-07:00</updated><title type='text'>Thanks for All the Fish</title><content type='html'>Well, today SOC draws to a close, so I decided to write up how things currently stand.  Thanks to all the Gnomers for letting me work on this!&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Abstract&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The point of this project was to investigate why nautilus's startup performance was slow, and to find different ways to fix the problem.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Nautilus's startup is slow because it keeps blocking and allows other applications to run. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Sometimes it is lucky and completes its startup quickly.  Other times it isn't, and gnome-panel (and all of the plugins) or other startup applications get in the way.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Problem definition&lt;/span&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;    minimize the amount of time between nautilus startup and mapping of the icons.&lt;/li&gt;&lt;li&gt;Remove the variability in the gnome startup times.&lt;/li&gt;&lt;/ol&gt;    I would argue that it is most important for the file manager to start up.  In particular the problem that I was solving (which is one that I see every time I start up any modern based Linux, be it on Fedora, Ubuntu or SUSE).  The X session starts, the panel shows up but it takes seconds for the icons to finally appear on the screen.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-size:180%;"&gt;Tools&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The general problem&lt;/span&gt;&lt;br /&gt;When I started this project, I had very little idea about the overall architecture of gnome.  When do various components start up? Which pieces were responsible for which parts of the system? There is documentation for some of the individual components (Nautilus's documentation is pretty good), but nothing that describes the overall startup of the gnome desktop. There are quite a few articles describing the exact startup sequences of Linux from the boot loader to the init scripts. It would be helpful if there was something similar for gnome, so it was at least clear how things were intended to work.&lt;br /&gt;&lt;br /&gt;The problem of understanding and analyzing the complex startup of gnome is actually a more general problem.  Gnome starts multiple threads, and multiple processes, each with interlocking dependencies.  A few general tools have been developed for Linux which help to understand these startup sequences.  No single tool seems to have all of the features needed to truly understand what is happening, so I tried a bunch of different tools.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;1) Bootchart (&lt;a href="http://www.bootchart.org/"&gt;http://www.bootchart.org/&lt;/a&gt;)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The first tool that I used to determine how exactly things were starting up was the "bootchart" tool.  This tool was originally designed to track the startup of programs in the Linux boot process.  Fortunately it is also flexible enough to monitor and track any other set of processes that are started at a later date.  It gives a nice graphical chart of when various processes started, and how much each used the cpu during the specified time period.&lt;br /&gt;&lt;br /&gt;The initial bootchart images were the first indication that something was unusual about nautilus startup.  Different charts of the same gnome startup sequence provided a radically different view when things started up, and how longs things took to start.  Sometimes the panel would begin first, and sometimes the gnome-settings-daemon would.  This was my first clue that some sort of race was happening.  If there wasn't a race, each startup should have an identical series of steps, with little to no variability.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;2) strace + 'access' hacks and Federico's visualization tool.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Next I started to use the analysis tools and techniques that others had used before me.  In particular, I used strace to generate, and Federico's tool to analyze the result.  This provided the most complete data. I've detailed in an earlier blog entry how to get it all started up.&lt;br /&gt;&lt;br /&gt;Once thing I haven't mentioned yet, is that I wrote a small python script to sort through the strace results. This will replace "pids" with the actual name of the processes.  This makes reading the strace results much easier. (It will trace clones, forks and execs, so you can see exactly what is running when...)&lt;br /&gt;&lt;br /&gt;#!/usr/bin/python&lt;br /&gt;import fileinput&lt;br /&gt;import string&lt;br /&gt;&lt;br /&gt;pid_to_name={}&lt;br /&gt;&lt;br /&gt;thread=1&lt;br /&gt;&lt;br /&gt;for line in fileinput.input():&lt;br /&gt;__pid = line.split()[0]&lt;br /&gt;&lt;br /&gt;__if pid_to_name.has_key(pid):&lt;br /&gt;____start =line.find(" ")&lt;br /&gt;____print pid_to_name[pid] + " " +line[start:-1]&lt;br /&gt;__  else:&lt;br /&gt;____print line[:-1]&lt;br /&gt;&lt;br /&gt;__if "execve(" in line:&lt;br /&gt;____param = line.split()[2]&lt;br /&gt;____pid_to_name[pid] = param.split("\"")[1]&lt;br /&gt;&lt;br /&gt;__if "clone(" in line:&lt;br /&gt;____new_name = pid_to_name[pid] + (".%d"%thread)&lt;br /&gt;____thread = thread + 1&lt;br /&gt;____new_pid = line.split("=")[-1].strip()&lt;br /&gt;____pid_to_name[new_pid] = new_name&lt;br /&gt;&lt;br /&gt;__  if "&lt;... clone resumed&gt;" in line:&lt;br /&gt;____new_name = pid_to_name[pid] + (".%d"%thread)&lt;br /&gt;____thread = thread + 1&lt;br /&gt;____new_pid = line.split("=")[-1].strip()&lt;br /&gt;____pid_to_name[new_pid] = new_name&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;3) NPTL Trace Tool (http://nptltracetool.sourceforge.net/usm_example.php)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Next, I wanted to try to visualized how the different thread interacted with the hope that I would be able to see some clear patterns as to why nautilus was continually yielding the CPU.&lt;br /&gt;&lt;br /&gt;The NPTL trace tool looked like promising avenue, because it had a graphical interface (using ntt_paje), and it was able to analyze a the complex interaction of threaded applications.  It can provide a visualization, as well as statistics about the amount of locks held, and report on the various pthread calls that were made.&lt;br /&gt;&lt;br /&gt;The trace tool produced the results into a format the &lt;a href="http://www-id.imag.fr/Logiciels/paje/"&gt;Paje&lt;/a&gt; visualization tool could open and analyze.  Paje itself is a little clunky and requires that the GNUstep environment is installed. I was able to capture a trace of a generic application, but unfortunately when I tried to run the trace tool an nautilus, it seg faulted.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;4) frysk (&lt;a href="http://sourceware.org/frysk/"&gt;http://sourceware.org/frysk/&lt;/a&gt;)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This was another tool that promised to make visualization and debugging of complicated threading applications/environments easy to do.  It, unfortunately, is still immature.  When I tried to monitor system calls that nautilus made, it also crashed.  However, this project is one to watch.  It provides a graphical interface analyze how, and allows you to add arbitrary rules to analyze how a process and set of threads were executing.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;5) Ideal tool&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;The ideal tool would be one that shows both when applications are running, and why they vacate the CPU when they are finished. Strace is really a poor man's version of this tool. However, it doesn't have any way to indicate when the scheduler decided to swap the running process, so if an application spends all of its time in user space, the strace method will never detect that the running application has changed.&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;6) Dishonorable mention&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Two tools that I tried to use, but were unsuccessful were &lt;a href="http://developer.amd.com/downloads.aspx"&gt;AMD's code analyst &lt;/a&gt;and &lt;a href="http://www.intel.com/cd/software/products/asmo-na/eng/vtune/vlin/239143.htm"&gt;Intel's VTune.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;AMD get props for releasing their Code Analyist as GPL.  It provides a GUI interface to oprofile, and some other niceties.  Unfortunately, the number of distributions that it supports is rather small, and it wouldn't build on my Fedora 5 box.  Even worse (in my opinion) was Intel's vtune. They provided a binary which could be used on Linux for non-commercial profiling.  However, once it detected that my processor was an AMD chip, it wouldn't let me use it.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Experiments: Summary&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Since it may have been difficult to follow what happened over the course of the summer, I'm going to summarize my experiments, and show how I came to the conclusions that I did.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Setup&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;First, my goal was to work with "warm-booting" nautilus, after everything had been loaded into the disk cache.  Second, I automated the startup and collection, so that I could run the same test repeatedly, and then analyze the results by either looking at visualizations, or running the tests through gnumeric.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Goals&lt;br /&gt;&lt;/span&gt;My goals were to remove the variability in the nautilus startup, and minimize the overall nautilus startup time.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Test 1  Full startup&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I added tracing information which started at the first line of "main" in nautilus, and stopped when the icons were painted on the screen. (in icon_container expose_event).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Result:&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;The startup times were highly variable, and other processes (such as those launched by gnome-panel) would run in between the startup of nautilus and the drawing of the icons.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Test 2 Full startup w/a high gnome-session priority&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;I edited the ".gnome/gnome-session" file, and set the priority of nautilus to 10, which was higher than that of gnome-panel.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Result:&lt;br /&gt;&lt;/span&gt;No change.  Nautilus was STILL interrupted by gnome-panel processes.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Test 3 Launch simply (no wm, no gnome-panel)&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;Next, I removed the "wm" and "gnome-panel" from the session file, and retimed nautilus's startup.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Result:&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;The nautilus startup times were still highly variable. Nautilus was being interrupted by the gnome-volume-manager, and the various daemons that were launched internally by gnome-session. (gnome-keyring)&lt;br /&gt;&lt;br /&gt;One interesting part of this investigation was that sometimes "gnome_vfs_get_volume_monitor()" took and long time to start, and other times it was very quick.  Federico had seen this, and he explained what was happening. Various gnome-daemons/services are launched on demand, and this call was causing this to take place.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Test 3 Launch simply (no wm, no gnome-panel) w/a gnome-vfs preloader&lt;br /&gt;&lt;/span&gt;I wanted to see if nautilus startup times would because a little, if the gnome-vfs-daemon was prestarted before nautilus started to launch. I modified the disk mount applet to call the gnome-volume-manager, and&lt;br /&gt;exit.  Then, I modified the gnome-session-manager to call this "gnome-vfs" preloader, and wait for 2 seconds while it started.  The 2 second delay is an unacceptable "real" solution, but it was enough to see prestarting the daemon would increase startup time.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Result:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The variability totally disappeared, and the startup times dropped to about 1/2 second.&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;Test 4 Launch complete session w/preloader.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I decided to go for broke.  I had the preloader within gnome-session, so I decided to try with gnome-panel and the gnome-wm turned back on.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Result:&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;unfortunately, the variability was back.  Upon closer investigation the applications were still interrupting nautilus between the time it was initially launched and the first icons were painted on the screen.&lt;br /&gt;&lt;br /&gt;Nautilus was yielding the cpu, and other applications (gnome-panel) were running before the nautilus startup could complete.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Test 5 Launch complete session w/preloader &amp; niced gnome-panel.&lt;br /&gt;&lt;/span&gt;Now, I wanted to see what happened if I made the nautilus the highest priority process, and basically made sure that it was the highest priority.  However, since I was not able to easily boost its priority (without becomes root), I decided to drop the priority of the processes competing for the CPU. I ran the gnome-panel with a nice of 19.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Result:&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;Nautilus's startup was very snappy, but the gnome-panel never seemed to be painted.  The variability was there because other applications (gnome-keyring) could still occasionally interrupt the startup.&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Next Steps&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;To really solve the startup problem, nautilus probably has to be rearchitected to stop blocking or it must somehow be made a higher priority than the other applications that startup.&lt;br /&gt;&lt;br /&gt;Here are some ideas on how to fix things:&lt;br /&gt;&lt;span style="font-size:130%;"&gt;0) Incorporate gnome-vfs startup in gnome-session&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This should probably be done even if it doesn't completely solve the nautilus startup problem.  gnome-vfs is needed by the gnome desktop. Launching it "on-demand" only causes blocking and unnecessary&lt;br /&gt;contention. (I haven't tried this yet. )&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;1) Rearchitecture Nautilus to remove all of the yielding.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This is a bigger change than I could do.  Even minimizing the number of times that it yields would reduce the chance that it would be interrupted.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;2) Remove the on-demand loading (or at least short circuit it, so it is never used in a properly running desktop. )&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;IMHO, the on-demand loading is a bad idea. Ideally all of the dependencies for an application would be running before it starts.  It would reduce the amount of times that applications block, and could&lt;br /&gt;possibly reduce thrashing as multiple applications wait for a component to start.&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;3) Modify nautilus to give gnome-session a thumbs up ONLY after the icons have painted.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Right now, gnome-session and the applications that it launch have a handshake that wait for an application to start. This response handshake would have to be modified to only respond after the icons have been painted.&lt;br /&gt;&lt;br /&gt;I've been working on getting this to work, but I can't figured out where the handshake takes place.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;4) Boost nautilus's priority&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If we can boost nautilus's priority, it may (possibly) not be interrupted.  I couldn't figure out how to do this without root privileges.&lt;span style="font-size:180%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-size:180%;"&gt;Summary&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;In any event, this has been a lot of fun.  I learned more about nautilus &amp;amp; gnome that I thought possible.&lt;br /&gt;&lt;br /&gt;Thanks to all!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-115620952296134030?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/115620952296134030/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=115620952296134030' title='66 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115620952296134030'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115620952296134030'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/08/thanks-for-all-fish.html' title='Thanks for All the Fish'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>66</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-115448710084356113</id><published>2006-08-01T19:19:00.000-07:00</published><updated>2006-08-01T22:20:16.653-07:00</updated><title type='text'>Ah ha!</title><content type='html'>A whole bunch has happened since I blogged last....&lt;br /&gt;&lt;br /&gt;For the most part, I figured out why nautilus startup time is so slow AND so variable.&lt;br /&gt;&lt;br /&gt;Nautilus is basically yields the CPU again and again as it starts up.  Even if nautilus is one of the first components launched, other components (ie, the panel, applets and more) will interrupt it, and prevent it from drawing the first icon on the screen.   Sometimes nautilus has the CPU for most of the time, and the startup times are very good.  However sometimes it will yield the CPU, and it will be seconds before it starts running again.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;First Blind Alley&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;As you can see is some of my last posts, I knew that nautilus was yielding the CPU.  However, I thought it was doing this because it was waiting for threads/components that weren't yet running. (In particular the gnome-vfs-daemon).&lt;br /&gt;&lt;br /&gt;I initially thought that nautilus also relied on gnome-volume-manager since gnome-volume-manager's  execution was often interleaved with nautilus's startup.  However, I found one run where this didn't happen, and nautilus started up just fine.. Therefore it can't be a dependency.  (I also discovered gnome-volume-manager was being launched by one of the gnome startup scripts)&lt;br /&gt;&lt;br /&gt;I modified the trash applet to preload the vfs-daemon before nautilus started up.   I modified gnome-session to call this hacked applet, wait 2 seconds, and then finally start nautilus.  This guaranteed that gnome-vfs-daemon was started before nautilus started.  This worked, sort-of, as nautilus start-up times were much more reproducible (as long as the panel wasn't running).  It is an imperfect solution, because that sleep is really unacceptable, but it could tell me if dependencies were the problem.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/NautilusPreload.0.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/400/NautilusPreload.0.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The interesting thing to note is that nautilus blocked several times from when executing main to the icon-painting.  Nautilus did some handshaking with various threads, and this caused things to block.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Nice-i-ness&lt;br /&gt;&lt;/span&gt;With this fix in place, I went for broke and turned on the panel.  The reproducible startup times that I had, were then completely shot to hell.  All of the blocking of the main nautilus process provided opportunities for the panel &amp; applet to run.  Once they started, nautilus might not execute again for a couple of seconds.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/preload.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/400/preload.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;So, on a whim, I decide to run the gnome-panel with a nice value of 19.  This worked pretty well.  Most of the nautilus startup times were on the order of 0.5 seconds.  However, every so often, the times would jump.  This was because other non-panel processes (such as gnome-volume-manager) started running when nautilus blocked.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/startup_nice.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/400/startup_nice.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;span style="font-size:180%;"&gt;&lt;br /&gt;What is Next?&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;I'm currently trying to figure out if I can reduce he amount of blocking that nautilus is doing. &lt;br /&gt;First, I have to figure out exactly what it is waiting on.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-115448710084356113?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/115448710084356113/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=115448710084356113' title='28 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115448710084356113'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115448710084356113'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/08/ah-ha.html' title='Ah ha!'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>28</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-115321840913535970</id><published>2006-07-18T02:45:00.000-07:00</published><updated>2006-07-18T04:44:03.743-07:00</updated><title type='text'>Startup Variability</title><content type='html'>I've figured out why the nautilus startup times are so variable.&lt;br /&gt;&lt;br /&gt;I removed both the window manager AND gnome-panel from the startup sequence to do my testing. I figure if I can remove the variability when only launching nautilus, then I should be able to add back in the others and make sure that things don't break.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;WHAT?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Even with gnome-session ONLY launching nautilus in the gnone-session file, nautilus still has highly variable startup times.&lt;br /&gt;&lt;br /&gt;gnome-session starts a whole bunch of things internally, and then launches nautilus.&lt;br /&gt;&lt;br /&gt;NOTE: I am currently running with ALL of the stracing turned on, so times may be slower than previous blog posts.&lt;br /&gt;&lt;br /&gt;Example:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/nautilus_no_gnome_panel_no_wm3-graph.0.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/400/nautilus_no_gnome_panel_no_wm3-graph.0.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here's the same thing sorted:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/nautilus_no_gnome_panel_no_wm3_sorted-graph.0.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/400/nautilus_no_gnome_panel_no_wm3_sorted-graph.0.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Look at how different the best and worst times are....&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;WHY?&lt;/span&gt;&lt;br /&gt;gnome-session launches a whole bunch of things asynchronously.  Nautilus starts a whole bunch of things asynchronously.  These are all racing to finish, and some depend on each other.&lt;br /&gt;&lt;br /&gt;Fastest run:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/output-20060709230824.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/400/output-20060709230824.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Slowest run:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/output-20060709230145.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/400/output-20060709230145.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Notice how a WHOLE bunch of processes startup between when nautilus first loads and the first icon is painted on the screen. (icon_container expose_event)&lt;br /&gt;&lt;br /&gt;If we could prevent nautilus from blocking, or at least let it show something on the screen before all of these other threads start, things would be much more consistent.&lt;br /&gt;&lt;br /&gt;NOTE: This totally sucks. Blogger is shrinking my pictures.  My flicker account won't let me show the whole version unless I go "pro".  I'll probably do that anyway.&lt;br /&gt;&lt;br /&gt;Until I figure out what to do, simply look for the "red" in the picture.  It indicates where new daemons were launched (exec'd).  When they happen early in the run, it means that the nautilus startup is interrupted.  When they happen late, it means that nautilus completed before they start.&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;What to ask? &lt;/span&gt;&lt;br /&gt;The questions that I really want to answer are:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Why does nautilus block during startup?  How can we stop it from blocking?&lt;br /&gt;(It looks as if it is waiting for other pieces of gnome to startup)&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;What daemons have to be pre-started for nautilus to begin instantly?  (or another way of looking at this is: What are dependencies does nautilus have on other pieces of gnome?)&lt;br /&gt;It looks like gnome-volume-manager and gnome-vfs-daemon need to be started early.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;How can the dependencies be started before everything else?&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-size:180%;"&gt;How do we answer these questions?&lt;/span&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Read through the raw strace logs.&lt;br /&gt;This is pretty helpful, but it is alot of data. I have to study exactly what the nautilus threads were doing when they yielded the CPU. It would be really nice if we could record when a process/thread is switched off of the CPU.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Extend Federico's visualization tool to show the interesting information&lt;br /&gt;I've currently extended it to show futex information, but it really generates an enormous image, and is very hard to see what is going on.  However, I think it would be useful to somehow show the parent/child relationship for some of the processes.  This would help determine the dependencies.. However, maybe this should be a different tool.&lt;br /&gt;&lt;/li&gt;&lt;li&gt; &lt;a href="http://sourceforge.net/projects/nptltracetool"&gt;The NPTL trace tool&lt;/a&gt;&lt;br /&gt;I just stumbled upon this, and I'm hopeful that this will give me some idea about how the various gnome threads interact. &lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-115321840913535970?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/115321840913535970/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=115321840913535970' title='23 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115321840913535970'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115321840913535970'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/07/startup-variability.html' title='Startup Variability'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>23</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-115249688939878025</id><published>2006-07-09T18:10:00.000-07:00</published><updated>2006-07-09T19:22:18.186-07:00</updated><title type='text'>Cookbook</title><content type='html'>&lt;span style="font-size:180%;"&gt;Reproducing&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;Alright, a few post back, people have been asking me how to reproduce the work that I've been doing.&lt;br /&gt;&lt;br /&gt;First, I've be relying heavily on Federico's code to do the tracing/visualization.  This requires that gnome is run within strace.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;0) Boot your machine to init level 3.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This will allow us to run startx, run through the gnome initialization cycle, and then end without adding stuff to gdm.&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;1) Sprinkle the gnome code with calls to program_log.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;First, I've created a header that I can include with the following code:&lt;br /&gt;gnome-profile.h&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;&lt;br /&gt;#include &amp;lt;unistd.h&amp;gt;&lt;br /&gt;#include &amp;lt;sys/time.h&amp;gt;&lt;br /&gt;#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;&lt;br /&gt;static void program_log (const char *format, ...)&lt;br /&gt;{&lt;br /&gt;va_list args;&lt;br /&gt;char *formatted, *str;&lt;br /&gt;struct timeval current_time;&lt;br /&gt;if (getenv("GNOME_PROFILING"))&lt;br /&gt;   {&lt;br /&gt;va_start (args, format);&lt;br /&gt;formatted = g_strdup_vprintf (format, args);&lt;br /&gt;va_end (args);&lt;br /&gt;gettimeofday(&amp;current_time,NULL);&lt;br /&gt;str = g_strdup_printf ("MARK: %s: %s",g_get_prgname(), formatted);&lt;br /&gt;g_free (formatted);&lt;br /&gt;access (str, F_OK);&lt;br /&gt;g_free (str);       &lt;br /&gt;}&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;unistd.h&gt;&lt;sys&gt;&lt;stdlib.h&gt;Example:&lt;br /&gt;  In nautilus-main.c in the first non-declaration line of 'main()' I have added:&lt;br /&gt;        program_log("%s Starting_Nautilus",__FUNCTION__);&lt;br /&gt; -and-&lt;br /&gt; In nautilus-icon-container.c in the first non declaration line of 'expose_event()', I have added:&lt;br /&gt;       program_log("icon_container expose_event");&lt;br /&gt;&lt;br /&gt;This is enough to time the nautilus startup.&lt;br /&gt;&lt;br /&gt;NOTE: This implementation is pretty hacky right now.   This should probably be pushed into a common library (glib?) somewhere, but right now it is handy because I don't have to rely on every application that I want to profile including a particular library.  I just add the "#include" and drop in some calls to "program-log" and I am off to the races.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;2)  Add stracing to xinitrc.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;When you launch gnome, you have to make sure strace is running,  so  I 've added calls to strace into my xinit file (for reasons described in previous posts) .&lt;br /&gt;&lt;br /&gt;Add something similar to the following to your ".xinitrc":&lt;br /&gt;&lt;/stdlib.h&gt;&lt;/sys&gt;&lt;/unistd.h&gt;&lt;blockquote&gt;&lt;span style="font-family:courier new;"&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;"exec strace -e clone,execve,open,access -ttt -f -o /tmp/gnome.log /home/gnome/bin/jhbuild run gnome-session"&lt;/span&gt;&lt;/blockquote&gt;&lt;span style="font-size:130%;"&gt;3) Start GNOME with profiling turned on.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now you can launch gnome with tracing turned on:&lt;br /&gt;&lt;blockquote&gt;env GNOME_PROFILING=1 startx&lt;br /&gt;&lt;/blockquote&gt;After this command has completed, an strace of the session will be sitting in tmp.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;4) Add a command to the session startup which will automatically teardown the session.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I've added the following shell script to my gnome-session startup: (You may have to adjust the initial sleep if things take longer than 5 seconds to startup.)&lt;br /&gt;tear_down.sh:&lt;br /&gt;&lt;blockquote&gt;#!/bin/bash&lt;br /&gt;sleep 5&lt;br /&gt;gnome-session-save --kill&lt;br /&gt;sleep 5&lt;br /&gt;killall X&lt;br /&gt;&lt;/blockquote&gt;&lt;span style="font-size:130%;"&gt;5) Turn off the "IS it ok to logout prompt"? &lt;/span&gt;&lt;br /&gt;Run "gnome-session-properties" an disable 'ask on logout".&lt;br /&gt;&lt;br /&gt;Now, you should be able to run startx, have gnome start, and then exit back to the initial prompt.&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;6) Script the automatic timing and analysis of the startup/teardown.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;First, I created a script to automatically determine the time to start the session and time to start nautilus and output that in a file called "summary":&lt;br /&gt;(The script is called logs.py)&lt;br /&gt;&lt;blockquote&gt;#!/usr/bin/python&lt;br /&gt;import sys&lt;br /&gt;import string&lt;br /&gt;found_event=0&lt;br /&gt;&lt;br /&gt;for line in sys.stdin:&lt;br /&gt;&lt;br /&gt;__if line.find("Starting_Nautilus")!=-1:&lt;br /&gt;____start_time = string.atof(line.split()[1])&lt;br /&gt;&lt;br /&gt;__if line.find('execve("/home/gnome/bin/jhbuild')!=-1:&lt;br /&gt;____session_start_time = string.atof(line.split()[1])&lt;br /&gt;&lt;br /&gt;__if (line.find("icon_container expose_event")!=-1) and found_event==0:&lt;br /&gt;____expose_time = string.atof(line.split()[1])&lt;br /&gt;____found_event=1&lt;br /&gt;&lt;br /&gt;print "Start-&gt;Icon_Expose (ses):", expose_time-session_start_time, "(Naut):", expose_time-start_time&lt;br /&gt;&lt;/blockquote&gt;...&lt;br /&gt;Next, I created a script to run this AND Federico's graphical analysis tool.   It will create a directory with the time of the run, a copy of the log, and the picture of the execution.&lt;br /&gt;&lt;br /&gt;I call it "test2.sh":&lt;br /&gt;&lt;blockquote&gt;#!/bin/bash&lt;br /&gt;while /bin/true&lt;br /&gt;do&lt;br /&gt;DATE=`date +%Y%m%d%H%M%S`&lt;br /&gt;mkdir $DATE&lt;br /&gt;env GNOME_PROFILING=1  startx&lt;br /&gt;cp /tmp/gnome.log $DATE/&lt;br /&gt;~/plot-timeline.py $DATE/gnome.log -o $DATE/output-$DATE.png&lt;br /&gt;rm -f /tmp/*.log&lt;br /&gt;cat $DATE/gnome.log | ./logs.py | tee $DATE/summary&lt;br /&gt;sleep 3&lt;br /&gt;done&lt;br /&gt;&lt;/blockquote&gt; &lt;span style="font-size:130%;"&gt;7) Run for infinity.&lt;br /&gt;&lt;/span&gt;Now, you can run this for a long time, and just let gnome startup/teardown.  After 40 or so runs, I stop the loop, and see what happened.&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;8) Analyze the results&lt;/span&gt;&lt;br /&gt;Next I gather all of the results into a single file with:&lt;br /&gt;&lt;blockquote&gt;"grep Nau */summary &gt; results.csv"&lt;br /&gt;&lt;/blockquote&gt;I can then load this file into gnumeric and graph the results.&lt;br /&gt;&lt;br /&gt;9) (Extra) Prune the startup of the session.&lt;br /&gt;&lt;br /&gt;Currently, I am just starting nautilus (no WM, and no gnome-panel).&lt;br /&gt;&lt;br /&gt;I have a ~/.gnome2/session file that looks as follows:&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;&lt;/span&gt;&lt;blockquote&gt;&lt;span style="font-family:courier new;"&gt;[Default]&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;num_clients=3&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;1,id=default1&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;1,Priority=10&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;1,RestartCommand=nautilus --no-default-window --sm-client-id default1&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;...&lt;br /&gt;That's all for right now.  I found something interesting, but I'll save that for the next post.&lt;br /&gt;&lt;br /&gt;ps. HI WIFEZILLA!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-115249688939878025?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/115249688939878025/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=115249688939878025' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115249688939878025'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115249688939878025'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/07/cookbook.html' title='Cookbook'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-115158058430208453</id><published>2006-06-29T03:58:00.000-07:00</published><updated>2006-06-29T04:41:35.743-07:00</updated><title type='text'>Bet ya bite a chip</title><content type='html'>I wanted to get something out earlier, but I've had a bit of nasty fever for the past few days. I promised in one of the blog comments to release the scripts that I am using for testing. I will do this very soon (probably in the next blog entry).  I have some cleanup that I need to do.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;&lt;span style="font-weight: bold;"&gt;fwrite-&gt;strace&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;In one of my past entries I mentioned that I was having trouble strace'ing startx, and that is what prevented me from using Federico's original patches.  Well, that was beginning to bite me.  First, my fwrite method did not handle threading very well. A message could get split a log was sent simultaneously by two different threads. (I think I could fix that though). Second, and more important, Federico's excellent visualization script wouldn't work.&lt;br /&gt;&lt;br /&gt;I determined that figuring out how to get strace to work was really the only way to go.&lt;br /&gt;&lt;br /&gt;First, it appears that the "SUID" problem was with strace and NOT with SELinux as I originally thought.  According to the strace man-page it is possible to create a suid strace binary, and that will allow suid binaries to work properly.   I created an suid-root strace, and that allowed me to "strace X".  However, when I tried to launch gnome with this binary it gave me errors (some thing about "use a helper application for SUID stuff").&lt;br /&gt;&lt;br /&gt;I had to take a different approach, so I changed my .xinitrc to launch an straced version of gnome rather than stracing startx directly.  The suid-root X would start normally, and then the stracing would begin.  That worked!&lt;br /&gt;&lt;br /&gt;Here's my currently working .xinitrc:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.xinitrc&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;#!/bin/bash&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;exec strace -e clone,execve,open,access -ttt -f -o /tmp/gnome.log /home/gnome/bin/jhbuild run gnome-session&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Notice that I added the "-e" option to Federico's default "-ttt -f" command line.  This will ONLY record the clone, execve,open,access system calls.  This speeds up the run-time, and reduces the amount of stuff that needs to be dumped in the log file.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:180%;" &gt;All is not rosey&lt;/span&gt;&lt;br /&gt;Unfortunately, now that am using strace, the start-up times seem a bit more sluggish.  However, I have to update my scripts to automatically figure out how long things are taking.  (I've just been reading log times manually.)&lt;br /&gt;&lt;br /&gt;Also, and more importantly, my session will no longer shutdown by itself with strace turned on.  I don't know if this is a bug in my distribution (with strace/ptrace) or an actual race within the gnome code.  It makes it a real pain when I want to run 40 tests because I have to sit there and press cntrl-alt-backspace.  Even then things don't get properly cleaned up.  I am going to work on a big hammer to fix this.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:180%;" &gt;1000 words&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Federico suggested that I rearrange the order of nautilus vs. gnome-panel in the gnome-session file, and see if that changes the variability that I have been seeing.  I don't have a definitive answer on that yet.   However, I did notice that nautilus's execution does become interlaced with the gnome-vfs-daemon, and the gnome-panel.  I wonder if it is possible to start the gnome-vfs-daemon before both of them, so nautilus won't have to wait for it to startup and/or compete with the panel for its resources.  It would be really nice to know all of the dependencies of the gnome daemons.  That way we could start the most needed ones earlier, and the non-conflicting ones in a parallel.&lt;br /&gt;&lt;br /&gt;In Any event, here's a picture of my current startup.  Note: most of the functions with a name of "(null)" are probably the gnome-vfs-daemon.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/output-20060629013949.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/320/output-20060629013949.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;NOTE: The long pause at the end is because the shutdown of gnome has hung...&lt;br /&gt;&lt;span style="font-size:180%;"&gt;&lt;br /&gt;Note to ESD: DON'T EVER RUN&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I noticed this before when I was shutting things down, but I saw it again on my logs, and it was really annoying me.  I have esd completely disabled, yet on shutdown the esd process always starts.&lt;br /&gt;&lt;br /&gt;I traced it down to "gnome-session-save".  However, I don't yet know which part is calling it.&lt;br /&gt;&lt;br /&gt;gnome-session-save calls this:&lt;br /&gt;execve("/bin/sh", ["/bin/sh", "-c", "/home/gnome/gnome2/bin/esd -term"...], [/* 35 vars */]) = 0&lt;br /&gt;which in turn calls this:&lt;br /&gt;execve("/home/gnome/gnome2/bin/esd", ["/home/gnome/gnome2/bin/esd", "-terminate", "-nobeeps", "-as", "2", "-spawnfd", "15"], [/* 35 vars */]) = 0&lt;br /&gt;&lt;br /&gt;This is the pebble in my shoe.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;&lt;span style="font-weight: bold;"&gt;Cookbook&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;As I mentioned above, i don't yet have a cookbook that other people can follow in my footsteps and reproduce things.  I will do that for the next blog.  I need to do a little bit of clean-up, but then I can provide a patch.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-115158058430208453?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/115158058430208453/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=115158058430208453' title='34 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115158058430208453'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115158058430208453'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/06/bet-ya-bite-chip.html' title='Bet ya bite a chip'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>34</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-115098065403845691</id><published>2006-06-22T05:45:00.000-07:00</published><updated>2006-06-22T06:50:08.716-07:00</updated><title type='text'>We're hunting wabbits</title><content type='html'>I've been trying to track down the variability that I mentioned in the last blog entry. It is beginning to feel like a threading/scheduling/race problem.&lt;br /&gt;&lt;br /&gt;When gnome is starting up, everything is competing for the CPU. 'gnome-panel' and 'nautilus' (among other things) are both trying to initialize at the same time.&lt;br /&gt;&lt;br /&gt;So, I've noticed two different things.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The first call to g_signal_connect in 'nautilus_application_instance_init' is fast in the fast case, and slow in the slow case.&lt;br /&gt;&lt;br /&gt;In particular, this call: 'g_signal_connect_object (gnome_vfs_get_volume_monitor (), "volume_unmounted", G_CALLBACK (volume_unmounted_callback), application, 0);'&lt;br /&gt;takes a long time (1.3 sec) when things are going slowly, but is very quick (0.2) secs when things are fast. This functions appears to add one or more threads that do a bunch of work.&lt;br /&gt;&lt;br /&gt;I've instrumented various functions in the file containing 'gnome_vfs_get_volume_monitor', and I can see that they are running, but I am not sure why. (I have yet to trace the execution from the signal connect to these functions actually running)&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;In the fast case, gnome-panel runs the code AFTER nautilus startsup. In the slow case, the gnome-panel execution is interspersed with the nautilus execution.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;Like I said above, this feels like a locking/scheduling problem.  I suspect it is something like:&lt;br /&gt;In the slow case: nautlius needs to do some work, but somebody else (gnome-panel?) holds the lock, so nautlius sleeps a LONG time after it realizes this.&lt;br /&gt;&lt;br /&gt;In the fast case: nautilus grabs the lock, and keeps going until it is ready to run. Another possibility: For the fast case, it also might be that the gnome-volume-monitor stuff has already been initialized before nautilus needs it. However, my instrumentation shows that in the fast case, nautilus is the FIRST thing to call the gnome-volume-monitor code, so that wouldn't explain things.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-115098065403845691?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/115098065403845691/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=115098065403845691' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115098065403845691'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115098065403845691'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/06/were-hunting-wabbits.html' title='We&apos;re hunting wabbits'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-115069308879977472</id><published>2006-06-18T21:47:00.000-07:00</published><updated>2006-06-19T04:58:39.836-07:00</updated><title type='text'>Strange and Magical world</title><content type='html'>&lt;span style="font-size:180%;"&gt;This IS a reproduction&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I've been trying to reproduce the slow nautilus startup times that Federico had been seeing on the CVS version of Gnome that I have checked out.   I was reading through code first, but that was really the wrong thing to do.  I should have actually timed how long things were taking, and THEN if it was still a problem read through the code.&lt;br /&gt;&lt;br /&gt;That's what I did.&lt;br /&gt;&lt;br /&gt;On the good news/weird new side, my total nautilus startup times (from the beginning of nautilus main to the expose_event sent to the icon_container) have been much much lower than the times that Federico reported in his blogs.  My total startup time avg was ~2secs.  His time between gtk_main and the icon_container expose event was 2 seconds!  (His time was only including the time between when control was handed over to gtk, and when things were finally painted.  Mine included the whole nautilius initialization AND the time for gtk to paint things.)&lt;br /&gt;&lt;br /&gt;I don't know WHY things would be so much better.  I really wonder if strace was skewing his results like it was doing for the X server.  I was also doing this from a WARM cache (his might have been from a cold cache.)   I'm not sure.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Highly Variability&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I wrote a few wrapper scripts to automatically launch gnome, time nautilus's startup, and then gather the results.  I was curious to see how consistent the startup time was.&lt;br /&gt;&lt;br /&gt;Short answer: It wasn't.   I had a minimum time of 0.398 sec and a maximum time of 4.411 sec.&lt;br /&gt;&lt;br /&gt;I've sorted the 44 runs that I made by time, so here's a picture of it:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/nautilus_startime2.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/320/nautilus_startime2.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;When I used to do SPEC CPU benchmarking, the variability from run-to-run would drive us crazy.  Some applications would be very fast in one run, and slow in another.  Ideally, you could have a run where all of the applications ran at their top speed.  It was frustrating.&lt;br /&gt;&lt;br /&gt;The compiler writers saw it differently.  If there was a variation, that meant it was possible for the application to always run as fast as the fastest time if they could only figure out what was slowing it down in the other times.&lt;br /&gt;&lt;br /&gt;The nice thing about the SPEC runs was that each of the programs were independent, so you really should be able to optimize things for each app to run at it's ideal time.  Gnome is a little different, since it is a complicated interlocking system.  The fast runs of Nautilus MIGHT be because some other startup program has already initialized the components that nautilus needs.  Hower, it does say that we have the possiblity for a VERY fast nautilus startup.&lt;br /&gt;&lt;br /&gt;Variablility makes it REALLY hard to optimize things, and can often times hide a race, so it really needs to be investigated.&lt;br /&gt;&lt;br /&gt;Ideally we could get nautilus to start in 0.398 secs.  Why doesn't it do it every time?  (This is what I'll be looking at for the next blog entry.  Stay tuned... )&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Cold cache vs. Warm cache&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;So far, I've been looking at the warm cache startup time of Nautilus.  Why?&lt;br /&gt;&lt;br /&gt;Basically I see the warm cache time as the "best we can do".  As we eliminate I/O in the cold cache time, we will eventually get closer and closer to the warm cache time.&lt;br /&gt;&lt;br /&gt;So while it does make sense to also optimize the cold cache time (as this will be what the first user logging on to a system will experience), starting with the optimization of the warm cache will allow me to understand and optimize the different parts of nautilus, and then see how adding disk I/O slows them down.  Also, I/O is scary.  ;-) (Ie, Once I/O is added to the mix, things get much less repeatable.  Although the "flush the cache" app should help with that.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-115069308879977472?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/115069308879977472/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=115069308879977472' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115069308879977472'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115069308879977472'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/06/strange-and-magical-world.html' title='Strange and Magical world'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-115038098258055382</id><published>2006-06-15T06:47:00.000-07:00</published><updated>2006-06-15T07:16:22.656-07:00</updated><title type='text'>She sells Seashells</title><content type='html'>No concrete progress right now, as I've basically been doing research.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;&lt;span style="font-weight: bold;"&gt;Reading Code&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;I've been spending alot of 'heads-down' time reading through the nautilus code and docs in the nautilus directory.  I'm beginning to understand how things fit together, and I can see the parts of nautilius that Federico analyzed/optimzed before.&lt;br /&gt;&lt;br /&gt;I am currently looking at why there is such a long time between the call to gtk_main, and the first call to "nautilus-icon-container.c expose_event()".  I've been stepping through the code with gdb just to get some idea of how it all fits together. This is horrible from a timing point of view, but at least I can see what parts of this enourmous system are called when.&lt;br /&gt;&lt;br /&gt;I know that this has been said before, but the asynchronous nature of nautilus makes it tricky to figure out what is happening and why.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:180%;" &gt;Tracing&lt;/span&gt;&lt;br /&gt;I also understand a little bit about how Federico's tracing code works.  It is a clever (ab)use of the 'access ' system call to basically put an arbitrary message in strace's message buffers.   Strace provides the time stamps, and handles the case where multiple threads might be calling the log function at the same time.&lt;br /&gt;Neat.&lt;br /&gt;&lt;br /&gt;BTW. If we would wrap this tracing in a check for an env variable (GNOME_TRACING? GNOME_LOG?), we should be able to include this in the source and ship it.   That way we can trace/time this stuff in the field. A call to getenv is SO lightweight, that benefit should outweigh the cost.&lt;br /&gt;&lt;br /&gt;....&lt;br /&gt;I haven't been able to use Federico's patches directly because I can't get strace to work on 'startx' when running FC5.  It probably doesn't like a user stracing a SUID (X) executable.  I wrote another variation which doesn't use strace, and instead opens a file, and dumps a timestamp and the message when the log function is called.&lt;br /&gt;&lt;br /&gt;BTW.  Does anyone know how to make strace work on FC5? &lt;br /&gt;&lt;br /&gt;When I start to 'strace startx' it doesn't work and I get some SElinux errors. I know this is vague, but maybe it rings a bell for someone. &lt;span style="font-weight: bold;font-size:180%;" &gt;&lt;br /&gt;X startup&lt;/span&gt;&lt;br /&gt;Last time I talked about some Xorg developers getting a 2X imrpovement in startup time.  Turns out it was closer to 5%. (Which is still nice).  &lt;br /&gt;&lt;br /&gt;Why 2X?  The original timings where when running strace on X.  Since the patchs removed MANY MANY system calls, it greatly decreased the overhead generated by strace, and as a result greatly sped-up the startup.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:180%;" &gt;Graduated&lt;/span&gt;&lt;br /&gt;On a personal note, I got my Masters last Thursday.  I flew from Pittsburgh, PA to Cambridge, MA and back again within 24 hours.  I flew through Philadelphia and sat in the airpot for HOURS as weather conditions slowed down the flights.  (Why can't all airports be like Pittsburgh and give free wifi?)&lt;br /&gt;&lt;br /&gt;I didn't get home to 3am, and worked the next morning. Ugh.&lt;br /&gt;&lt;br /&gt;--Phil&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-115038098258055382?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/115038098258055382/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=115038098258055382' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115038098258055382'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/115038098258055382'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/06/she-sells-seashells.html' title='She sells Seashells'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-114952490908138868</id><published>2006-06-05T09:18:00.001-07:00</published><updated>2006-06-05T09:40:40.110-07:00</updated><title type='text'>Sparklines and other goodies</title><content type='html'>&lt;span style="font-size:180%;"&gt;Visualization of Perf data&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I'm spending a little bit of time setting up my tools to get starting with the profiling. I want to be able to combine the information that Federico's patches provides with the CPU usage information provided by bootchart. Once I have this working, I will be able to run a test, and see all of the perf data in a single picture.&lt;br /&gt;&lt;br /&gt;I'm using Tufte's ideas of sparklines (&lt;a href="http://en.wikipedia.org/wiki/Sparkline"&gt;http://en.wikipedia.org/wiki/Sparkline&lt;/a&gt;) to convey how the CPU usage changes over time. I'm adapting a python script that can be used to generate sparklines (&lt;a href="http://bitworking.org/projects/sparklines/"&gt;http://bitworking.org/projects/sparklines/&lt;/a&gt; ).&lt;br /&gt;&lt;br /&gt;Right now, it will read the data files generated by the bootchartd daemon, so you can pick your visualization (native bootchart -or- sparkline) after the data has been generated.&lt;br /&gt;&lt;br /&gt;I added the ability to automatically add tags on different events, so that I can integrate the profiling events generated by Federico's patches into the graph.&lt;br /&gt;&lt;br /&gt;Here's a current sample: (I know, it needs a legend.. Right now the graph is of the process's CPU usage (combined user+system))&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/startx_FC4.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/320/startx_FC4.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;(OH, BTW: This patch (&lt;a href="http://bugzilla.gnome.org/show_bug.cgi?id=319682"&gt;http://bugzilla.gnome.org/show_bug.cgi?id=319682&lt;/a&gt;) to add sparklines should DEFINITELY be included in gnome-system-monitor)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Gathering Profiling Data&lt;/span&gt;&lt;br /&gt;I've been thinking a bit about how Federico added the current profiling information, and I think it is the right way to go. I don't need the capabilities of DBUS right now, because I can do all of the analysis after the run has completed.&lt;br /&gt;&lt;br /&gt;I think that Federico's profiling changes could be included in the mainline of gnome if we wrapped them in a test of an environmental variable (say GNOME_PROFILING). A getenv is trivial compared to the other stuff that goes on in the libraries/apps, and it would allow us to profile stuff in the field as well.&lt;br /&gt;&lt;br /&gt;Once I finish up the visualization stuff, I will add the that to his profiling patches. I am probably also going to add a printout of the "pid" (rather than program name), so that I merge the data from bootchart and the patches.&lt;br /&gt;&lt;span style="font-size:180%;"&gt;&lt;br /&gt;X startup time&lt;/span&gt;&lt;br /&gt;It looks like Thomas Leonard and Daniel Stone have come up with a way to dramatically speed up X server startup.&lt;br /&gt;&lt;br /&gt;Combining a patch in bugzilla and patch developed on the list by Thomas and Daniel, the there is almost a 2X speedup in the X server startup time.&lt;br /&gt;&lt;br /&gt;For more info follow the thread started here:&lt;br /&gt;&lt;a href="http://lists.freedesktop.org/archives/xorg/2006-June/015746.html"&gt;       http://lists.freedesktop.org/archives/xorg/2006-June/015746.html&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-114952490908138868?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/114952490908138868/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=114952490908138868' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/114952490908138868'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/114952490908138868'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/06/sparklines-and-other-goodies_05.html' title='Sparklines and other goodies'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-114918748958999633</id><published>2006-06-01T10:59:00.000-07:00</published><updated>2006-06-01T11:44:49.616-07:00</updated><title type='text'>Slow X startup &amp; Tracking initialization</title><content type='html'>After discovering that X startup takes a while even with the simplest configuration, I did a little more digging, and found that the PCI probe code was accessing nonexistent "/proc/pci" files about 8000 times!&lt;br /&gt;&lt;br /&gt;Two people responded to the timing request (Thanks guys!), and their startup times were about the same as mine.  Oddly enough my wife's nvidia base laptop was 2.2 second, but it still have the hammering of "/proc/pci".&lt;br /&gt;&lt;br /&gt;I've send mail off the X developers, and hopefully they can fix it. &lt;br /&gt;&lt;br /&gt;More details at: &lt;br /&gt;http://lists.freedesktop.org/archives/xorg/2006-June/015746.html&lt;br /&gt;&lt;br /&gt;Update: &lt;br /&gt;It looks like davejhas already found this: http://kernelslacker.livejournal.com/34960.html&lt;br /&gt;&lt;br /&gt;--&lt;br /&gt;&lt;br /&gt;Also I've talked with Federico, and J5 about how to track when different applications startup and complete initialization. It looks like sending timestamped messages over the DBUS is a possibility. &lt;br /&gt;&lt;br /&gt;I might also be able to get what I want by recording all of the timestamps of the major events, and then figuring out after the fact what is happening where.   (This is pretty much was Federico's timing patches did.)  If I sleep for a long enough time before starting the session close, I won't shutdown before everything is complete.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-114918748958999633?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/114918748958999633/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=114918748958999633' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/114918748958999633'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/114918748958999633'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/06/slow-x-startup-tracking-initialization.html' title='Slow X startup &amp; Tracking initialization'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-114896341812178508</id><published>2006-05-29T21:15:00.000-07:00</published><updated>2006-05-30T07:38:45.456-07:00</updated><title type='text'>X startup &amp; other goodies</title><content type='html'>I've been automating the generation of the boot charts for gnome-session.  (More on that later)&lt;br /&gt;&lt;br /&gt;I've begun to notice a few things.&lt;br /&gt;&lt;br /&gt;1) X takes a long time (3-4 seconds) to startup even with a warm cache.&lt;br /&gt;&lt;br /&gt;In all of my warm-cache charts, gnome-session didn't execute until a few seconds after X started.&lt;br /&gt;&lt;br /&gt;I was wondering if this was something to do with the gnome-session (ie, it has a lot of libraries that need to be loaded, etc.), so I decided to start X with the simplest of all xapps "xdpyinfo".&lt;br /&gt;&lt;br /&gt;Even with this simple application, the whole startup/teardown takes 3-4 seconds.&lt;br /&gt;&lt;br /&gt;It is easy to reproduce this:&lt;br /&gt;A) Drop to the console mode (init 3)&lt;br /&gt;B) Create a script called "startup-xdpyinfo.sh"&lt;br /&gt;&lt;br /&gt;"xinit /usr/bin/xdpyinfo -- /usr/bin/X :0"&lt;br /&gt;C) time "startup-xdpyinfo.sh"&lt;br /&gt;&lt;br /&gt;(NOTE: I would be really interested to hear what other's times are. I am using the Xorg 7.0 release from FC5, and "ATI Radeon XPRESS 200M 5955", and my times are 3-4 seconds for the entire startup/shutdown. )&lt;br /&gt;&lt;br /&gt;When running bootchart, I get the following (all of my runs look very similar to this) :&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/xdpy-05290851.10.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/320/xdpy-05290851.10.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here's the script I used to generate the graphs:&lt;br /&gt;&lt;br /&gt;#!/bin/bash&lt;br /&gt;TIME=`date +%m%d%H%M.%S`&lt;br /&gt;script -c "time /sbin/bootchartd start /home/gnome/no_xinit/startup-xdpyinfo.sh" ./xdpy-$TIME.log&lt;br /&gt;bootchart bootchart.tgz&lt;br /&gt;#rm bootchart.tgz&lt;br /&gt;mv ./bootchart.svgz ./xdpy-$TIME.svgz&lt;br /&gt;&lt;br /&gt;NOTE: If I run X with the "-probeonly" option, the times are only 2&lt;br /&gt;seconds, so it is appears as if X is probing for 2 seconds.&lt;br /&gt;&lt;br /&gt;Here's a graph of that:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/xdpy-05291614.56.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/320/xdpy-05291614.56.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;2) Components don't startup in the same sequences during the login.&lt;br /&gt;&lt;br /&gt;The startup sequence is not always constant.  There may be some inefficiencies because of this.&lt;br /&gt;&lt;br /&gt;If Process A depends on Process B, but they start in parallel, Process A could be waiting, but Process B can't run because Process A is on the CPU.&lt;br /&gt;&lt;br /&gt;However, you also don't want to add dependencies that don't exist.&lt;br /&gt;&lt;br /&gt;I have to look into this more.  Here's my first pass at determining the startup order:&lt;br /&gt;&lt;br /&gt;A) gnome-session -&gt; gconfd-2 -&gt; gnome-keyring-daemon -&gt; bonobo-activation-run-query-&gt;&lt;br /&gt;&lt;br /&gt;B) gnome-settings- -or- metacity  (Start in Parallel?)&lt;br /&gt;&lt;br /&gt;C) (Always after Metacity) gnome-panel -&gt; Nautilus&lt;br /&gt;&lt;br /&gt;D) The rest:&lt;br /&gt;&lt;br /&gt;gnome-volume-manager -&gt; gnome-vfs-daemon-&gt; wnck-applet -&gt; gnome-settings -&gt; notification-ar -&gt; clock-applet -&gt; mixer_applet2 -&gt; mapping-deamon -&gt;&lt;br /&gt;&lt;br /&gt;3) xrdb is almost always in a "zombie" state.&lt;br /&gt;&lt;br /&gt;All of the graphs that I've looked at show the 'xrb' command started by gnome-settings becomes a Zombie. This might be related to an error message that I receive about "gnome-settings-daemon" failing during startup.&lt;br /&gt;&lt;br /&gt;NOTE:  I was able to initiate a gnome-session shutdown programmability (Thanks Vincent) using "gnome-session-save --kill".&lt;br /&gt;&lt;br /&gt;Questions:&lt;br /&gt;1) What is the most appropriate time to start a shutdown? (ie, how do we know that everything is up and initialized? ) Currently, the shutdown command is is one of the startup programs that gnome-session runs.&lt;br /&gt;&lt;br /&gt;However, since the startup programs don't run in any particular order, it can begin the shutdown early or late in the startup cycle. This means that there is a large variation in the total startup/shutdown time. This is annoying when trying to profile, and improve startup time.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-114896341812178508?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/114896341812178508/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=114896341812178508' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/114896341812178508'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/114896341812178508'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/05/x-startup-other-goodies.html' title='X startup &amp; other goodies'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-28784075.post-114865348515359787</id><published>2006-05-26T06:49:00.000-07:00</published><updated>2006-05-26T07:24:45.166-07:00</updated><title type='text'>First steps and Gnome-startup visualization</title><content type='html'>Hi all,&lt;br /&gt;My project was accepted for Google Summer of code 2006.  Wahoo!&lt;br /&gt;&lt;br /&gt;I'll be looking at optimizing the startup performance of Nautilus to both make it faster to start and easier to profile. I've been around the block a bit for profiling (Buy my &lt;a href="http://www.amazon.com/gp/product/0131486829/ref=sr_11_1/102-7043538-9042545?%5Fencoding=UTF8"&gt;book&lt;/a&gt;, wink, wink... ) , and I've been wanting to do some optimization for GNOME for a while.  Now I have a good reason, sweet.&lt;br /&gt;&lt;br /&gt;In any event, while I wait for jhbuild to complete on my dodgy internet connection,  I have been playing with &lt;a href="http://www.bootchart.org/"&gt;bootchart&lt;/a&gt; to see what the system is doing when gnome starts up.&lt;br /&gt;&lt;br /&gt;A lot of work had been done to optimize the system startup time using bootchart, but it can also be used to track the startup of anything. (In this case, I track "startx")&lt;br /&gt;&lt;br /&gt;Here's how to do it:&lt;br /&gt;1) Download and build the rpm. (You'll need a bunch of java libraries)&lt;br /&gt;&lt;br /&gt;2) Install the RPM.&lt;br /&gt;&lt;br /&gt;3) reboot your machine, and start in init level 3. (pass "3" to the boot parameters or modify your inittab)&lt;br /&gt;&lt;br /&gt;4) log in as root&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;(We'll now gather the "cold disk cache" statistics)&lt;/span&gt;&lt;br /&gt;5) Tell bootchart to monitor startx and any processes that it generates: "bootchartd start startx"&lt;br /&gt;&lt;br /&gt;6) Log out as soon as possible.&lt;br /&gt;&lt;br /&gt;7) Run "bootchart".  This will generate an SVG file named "./bootchart.svgz" which contains the graph of the startup time.&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;br /&gt;(Now we'll gather the "warm disk cache" statistics.)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;8) Tell bootchart to monitor startx and any processes that it generates: "bootchartd start startx"&lt;br /&gt;&lt;br /&gt;9) Log out as soon as possible.&lt;br /&gt;&lt;br /&gt;10) Run "bootchart".  This will generate an SVG file named "./bootchart.svgz" which contains the graph of the startup time.&lt;br /&gt;&lt;br /&gt;Here is an example from FC4:&lt;br /&gt;&lt;br /&gt;FC4 cold disk cache:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/FC4-startx-cold-cache.png"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/320/FC4-startx-cold-cache.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;FC4 warm disk cache:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/FC4-startx-warm-cache.png"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/320/FC4-startx-warm-cache.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Observations:&lt;br /&gt;1) esd seems to start when we try to logout.  That's pretty annoying, and I'm sure it is a bug.&lt;br /&gt;2) In the warm cache case, xkbcomp does a lot of disk I/O.&lt;br /&gt;&lt;br /&gt;Here's the same thing for FC5:&lt;br /&gt;FC5 gnome startup (cold disk cache):&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/FC5-gnome-startup-cold.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/320/FC5-gnome-startup-cold.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;FC5-Gnome startup (warm disk cache):&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/3965/3054/1600/FC5-gnome-startup-warm.png"&gt;&lt;img style="cursor: pointer;" src="http://photos1.blogger.com/blogger/3965/3054/320/FC5-gnome-startup-warm.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Observations: &lt;br /&gt;&lt;br /&gt;1) gconfd-2 chews up a chunck of CPU time&lt;br /&gt;2) The esd bug still appears (look at the cold cache chart) &lt;br /&gt;&lt;br /&gt;As I see more things, I'll post them.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/28784075-114865348515359787?l=ezolt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ezolt.blogspot.com/feeds/114865348515359787/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=28784075&amp;postID=114865348515359787' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/114865348515359787'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/28784075/posts/default/114865348515359787'/><link rel='alternate' type='text/html' href='http://ezolt.blogspot.com/2006/05/first-steps-and-gnome-startup.html' title='First steps and Gnome-startup visualization'/><author><name>ezolt</name><uri>http://www.blogger.com/profile/06993911806502857507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry></feed>
