
Quickcam device driver for Linux



jkf@tiger.franz.com writes:

>>> The viewer reads in an image via the parallel port; each read of the
>>> parallel port consists of a context switch from user space to kernel
>>> space and back to user space again.
>
> Is the switch to kernel space necessary in Linux?  I assumed that the
> ioperm() call gave permission to access the parallel port i/o addresses
> to the user-mode process.

Yes, this is true, but ioperm() is very, very *Intel* Linux specific. Linux
is no longer an Intel-only product; it is running on many, many platforms.
Also, this will not work at all on non-Linux systems.
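
For anyone who hasn't seen it, this is roughly what the ioperm() approach
looks like on an Intel box -- a minimal sketch only, assuming the usual
0x378 base address for LPT1, and it has to run as root:

/* minimal sketch of user-space parallel port access via ioperm() --
   Intel Linux only, must run as root */
#include <stdio.h>
#include <unistd.h>
#include <sys/io.h>			/* ioperm(), inb(), outb() on x86 Linux */

#define LPT_BASE 0x378			/* usual base address of LPT1 */

int main(void)
{
	unsigned char status;

	/* ask the kernel for direct access to the data/status/control ports */
	if(ioperm(LPT_BASE,3,1)<0) {
		perror("ioperm");
		return 1;
	}

	outb(0x00,LPT_BASE);		/* write the data register */
	status = inb(LPT_BASE+1);	/* read the status register */
	printf("status = 0x%02x\n",status);

	ioperm(LPT_BASE,3,0);		/* give the access back */
	return 0;
}

The moment you leave Intel Linux, none of the above exists, which is the
point I'm trying to make.
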
>>> Then the server pipes the data upstream to the viewer, where each
>>> write on the pipe constitutes another context switch from user space
>>> to kernel space and back again,
>
> Of course the server only sends data up the pipe when it has captured a
> whole image.  If the image has an interesting depth and size then this
> write isn't going to happen more than five times a second, so the kernel
> overhead is insignificant.
>
Sure....I'm not saying performance would definitely be a problem, I'm just
saying that I would be concerned about it as a *potential* source of
problems. Ummm...I don't have a qcam reference right with me, but the max
rez is something like 300x200, roughly, correct? Let's use those numbers
for a second, and assume that the max color depth is 4 bits per pixel (16
shades). That's 30K of data 5 times a second, which is 150K per second
between the client and server apps, then the same 150K per second (plus
additional overhead -- namely double, 300K, if we're talking to an 8 bit
per pixel X11 server) between the client app and the X11 server in our
example scenario. So you have a total of about 300K or more of data per
second traveling through the network layer of the kernel continuously. I'm
not so sure I'd be so quick to call that insignificant.
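
A quick back-of-the-envelope with those rough numbers (the real QuickCam
resolutions may differ a bit):

	300 x 200 pixels at 4 bits/pixel  =  ~30K per frame
	~30K/frame x 5 frames/sec         =  ~150K/sec   camera server -> client
	same frames at 8 bits/pixel to X  =  ~300K/sec   client -> X11 server

so somewhere in the 300K-450K per second range is moving through the kernel
the whole time the viewer is running.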

Don't get me wrong though, I think there should be an image server, but I
don't think the image server should know anything about the camera device
hardware; that should be a normal device interface just like any other
device. There could be situations, for example, where the amount of network
traffic (or the type of traffic) on a particular host would preclude such a
method.

> Of course I'm just talking about the Quickcam.  There are plenty of
> other video devices out there, most based on add-on boards and attached
> video cameras, and each with its own low level programming interface.
> I doubt that there is enough in common between them and the Quickcam
> that you will be able to write a single /dev/camera driver, or even a
> common programming interface that really expresses the unique features
> of each camera (although you could settle for the Video for Windows
> api, as that's known to work).

No, you don't write a single device driver for all devices. Take ethernet
cards, for example: you have a Novell NE2000 device driver and a DEC
xxx-200 (or whatever?) driver; they are separate Linux device drivers, and
depending on your hardware, you install the appropriate one. Once that's
done, the API remains the same, so the system beyond the device driver
itself cannot tell the difference between one and the other, even if the
hardware itself is vastly different. Imagine if every application had to
know about the particular ethernet hardware installed on the system; it
would obviously be ridiculous. So what I'm saying is: you write a single
*API* for all camera devices and then a device driver for each make of
camera, in exactly the same fashion as all the other devices. A camera is
no different from any other streaming device.

Sure, there is plenty in common between cameras: they all generate a matrix
of pixels, they all have varying numbers of color planes and resolutions,
some have brightness and contrast control, and they can all be in some sort
of "auto refresh" mode or in "snap shot" mode. Some have zoom capability,
some don't; some surveillance types have x/y pivot control, some don't;
most are fixed focus, but you could have focus control in the API anyway.

To design a unix camera device driver, one must imagine a hypothetical
camera capable of all the things a camera can do (within reason), then
design an ioctl() based API that allows for all the different operating
conditions. Once this is done, one then applies this generic API to the
QuickCam(tm) (and/or others) in the form of a device driver.

You can program the device driver to support exclusive (locked) and
*optionally* shared (unlocked) mode. In exclusive mode, the process that
locked the device has exclusive use of it, and all other attempts to access
(open()) the device will be refused in the standard way. In the case of
shared mode, you will have to have a control block within the device driver
for each open() session that keeps track of what mode the camera should be
in for that session, and then you must switch control blocks and reset the
camera modes based on whoever is doing the access at any given time (a
rough sketch of such a control block is below). Obviously shared mode will
be trickier to implement, and there will be a performance penalty when
multiple processes are accessing the device, but this is true with most
other devices too. Nevertheless, the device driver always keeps the camera
in a "known" state when nobody is using it (for the case of "cat
/dev/camera" or other generic apps that are not necessarily camera "wise").

When a "camera wise" user space app accesses the device driver, it does its
first ioctl() to determine the capabilities and current settings of the
camera, then it does another ioctl() to set the appropriate operating
modes, etc., then just reads the data as appropriate and does a close()
when it's done.
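
To make that shared-mode bookkeeping a little more concrete, here is a very
rough sketch (the names are invented for this post, no real kernel entry
points are shown, and it uses the tCamParam structure from the header
further down) of the kind of per-open() control block I mean:

// purely illustrative -- one control block per open() session
typedef struct CamSession {
	tCamParam		Params;		// the modes this session asked for
	int			Exclusive;	// non-zero if opened in locked mode
	struct CamSession	*pNext;		// driver keeps a list of open sessions
} tCamSession;

static tCamSession	*pActive = 0;	// whose modes the hardware holds right now

// called before servicing a read()/ioctl() for a given session: if the
// hardware is currently programmed for some other session, push this
// session's modes down to the camera first
static void CamSwitchTo(tCamSession *pSession)
{
	if(pActive != pSession) {
		// ...program resolution, color depth, brightness, etc.
		// from pSession->Params into the camera hardware here...
		pActive = pSession;
	}
}

// when the last session close()s, the driver resets the camera to its
// known default state so that "cat /dev/camera" style apps see something
// sane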

Anyway, what I'm describing here is the proper and correct way to implement
*any* device interface at its lowest level on any unix system. Any other
way should be considered a quick and dirty hack -- not to say that quick
and dirty hacks are a bad thing, they're great for proof of concept, but
for the long haul I think we need a proper device driver interface and some
appropriate minimal tool suite to complement it.....

A skeletal example of a user space app using this method:

---------clipty clip-------------

/* FILE: camera_api.h */
/* include this file in camera applications */

// some limits on array sizes
#define MAX_CAM_REZ	16	// max number of different resolutions
#define MAX_CPLANE	16	// max number of different num of color planes

// bits to specify which parameters we're setting
#define	CAM_REZ		0x00000001
#define	CAM_CPLANE	0x00000002
#define	CAM_PIVOT	0x00000004
#define	CAM_FOCUS	0x00000008
#define	CAM_BRIGHT	0x00000010
#define	CAM_CONTRAST	0x00000020
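
// hypothetical ioctl request codes for this sketch of a /dev/camera
// driver -- the names and numbers are invented here; the real numbering
// scheme is up to whoever writes the actual driver
#define	CAM_GET_PARAMS	0x6301	// read the current tCamParam settings
#define	CAM_SET_PARAMS	0x6302	// apply a tCamParam back to the driver
#define	CAM_GET_FRAME	0x6303	// read one frame into a tCamData buffer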

typedef int Boolean;		// "Boolean" is not standard C, so define it here

typedef struct {
	int	width;
	int	height;
} tCamResolution;

typedef struct {
	unsigned int		SetArgs;
//
	tCamResolution	CamRez[MAX_CAM_REZ];
	short			CurrentRez;
	short			NumRez;
//
	int				BitsPerPixel[MAX_CPLANE];	 // color depth(s)
	short			CurrentCPlane;
//
	Boolean			HasPivot;
	short			CurrentXPos;		//  0=center, <0 = left, >0 = right
	short			CurrentYPos;		//  0=center, <0 = up, >0 = down
//
	Boolean			HasFocus;
	short			FocusPos;		// 0=infinity...32767=max
//
	Boolean			HasBrightness;
	short			BrightnessPos;
//
	Boolean			HasContrast;
	short			ContrastPos;
//
	// etc...

} tCamParam;

typedef struct {
	unsigned char	*pData;
} tCamData;


-----------clipity clip----------


/* FILE: camera_app.cpp */
/* open device, set modes, read device, close device */
...
...
#include <stdio.h>		// printf(), perror()
#include <stdlib.h>		// malloc()
#include <fcntl.h>		// open(), O_RDONLY
#include <unistd.h>		// close()
#include <sys/ioctl.h>
#include "camera_api.h"
...
...
void open_camera(void)
{
	int	fd;
	if((fd=open("/dev/camera",O_RDONLY))>=0) {
		int			n;
		tCamParam 	CamParams;
		tCamData		CamData;

		// get the current camera parameters
		CamParams.SetArgs = 0;
		ioctl(fd,CAM_GET_PARAMS,&CamParams);
		...
		...

		printf("Camera Resolutions: ");
		for(n=0;n<CamParams.NumRez;n++) {
			printf("%dx%d ",
			CamParams.CamRez[n].width,CamParams.CamRez[n].height);
		}
		printf("\n");
		...
		...
		...

		// set a new camera resolution (2) and color depth (3)...
		CamParams.SetArgs = (CAM_REZ|CAM_CPLANE);
		CamParams.CurrentRez = 2;
		CamParams.CurrentCPlane = 3;
		ioctl(fd,CAM_SET_PARAMS,&CamParams);
		...
		...

		// get a frame buffer of the size we'll need...
		CamData.pData = (unsigned char *)malloc(
					CamParams.CamRez[CamParams.CurrentRez].width *
					CamParams.CamRez[CamParams.CurrentRez].height *
					(CamParams.BitsPerPixel[CamParams.CurrentCPlane] / 8)
					);
		...
		...

		// read the data...
		while(WeStillReadData) {
			ioctl(fd,CAM_GET_FRAME,&CamData);
			// do something with the image....let's say
			// we're blitting it to an X11 window...
		}
		...
		...

		// put the camera back to its default state...
		close(fd);
	}
	else {
		perror("/dev/camera");
	}
}

----- clipity clip---------

and so on...you get the idea....

Now if this application were a digital video player, on a per frame basis
you would only have one context switch for the read, plus the context
switching and network overhead associated with the X11 display, which of
course is unavoidable if you're talking X11. You can still view the image
over a network, as is inherent with X11, but in this case the only network
traffic is the actual bitmap output to the screen (i.e. one trip instead of
two). The point here is that local (i.e. non-network) applications can
operate while consuming much less CPU resource than if the image were
looping back from a network server. In the case where you want a network
image server, you write a similar app as above, but basically blit the data
down a socket rather than to an X11 window (a rough sketch is below).
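
For example, a sketch only -- assume the socket is already connected and
the camera parameters were set up exactly as in open_camera() above, and
CAM_GET_FRAME is the hypothetical request code from camera_api.h:

// same read loop as open_camera(), but each frame goes down an
// already-connected socket instead of to an X11 window
void stream_frames(int fd, int sock_fd, tCamParam *pParams, tCamData *pData)
{
	int	FrameBytes = pParams->CamRez[pParams->CurrentRez].width *
				pParams->CamRez[pParams->CurrentRez].height *
				(pParams->BitsPerPixel[pParams->CurrentCPlane] / 8);

	for(;;) {
		if(ioctl(fd,CAM_GET_FRAME,pData)<0)		// grab one frame
			break;
		if(write(sock_fd,pData->pData,FrameBytes)!=FrameBytes)
			break;					// reader went away
	}
}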

Mike Sharkey
X11 Development
SoftArc Inc.
msharkey@softarc.com

