Harald Welte [Wed, 10 Aug 2005 03:26:03 +0000 (20:26 -0700)]
[DCCP]: make <linux/dccp.h> include-able from userspace
The protocol header files in <linux/foo.h> are usually structured in a
way to be included by userspace code. The top section consists of
general protocol structure definitions, typedefs, enums - followed by
an #ifdef __KERNEL__ section.
Currently <linux/dccp.h> doesn't follow that convention and can
therefore not be used from userspace. However, for example iptables'
libipt_dccp.c actually needs various definitions from there.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Use const where possible and get rid of EXTRACT() macro
that was never used.
Signed-off-by: Stephen Hemmigner <shemminger@osdl.org> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Olsson [Wed, 10 Aug 2005 03:25:06 +0000 (20:25 -0700)]
[IPV4]: fib_trie: Use ERR_PTR to handle errno return
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Olof Johansson [Wed, 10 Aug 2005 03:24:39 +0000 (20:24 -0700)]
[IPV4]: FIB Trie cleanups.
Below is a patch that cleans up some of this, supposedly without
changing any behaviour:
* Whitespace cleanups
* Introduce DBG()
* BUG_ON() instead of if () { BUG(); }
* Remove some of the deep nesting to make the code flow more
comprehensible
* Some mask operations were simplified
Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Wed, 10 Aug 2005 03:24:15 +0000 (20:24 -0700)]
[NETFILTER]: return ENOMEM when ip_conntrack_alloc() fails.
This patch fixes the bug which doesn't return ERR_PTR(-ENOMEM) if it
failed to allocate memory space from slab cache. This bug leads to
erroneously not dropped packets under stress, and wrong statistic
counters ('invalid' is incremented instead of 'drop'). It was
introduced during the ctnetlink merge in the net-2.6.14 tree, so no
stable or mainline releases affected.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 03:23:36 +0000 (20:23 -0700)]
[NETFILTER]: more verbose return codes from nf_{log,queue}
This adds EEXIST to distinguish between the following return values:
0: nobody was registered, registration successful
EEXIST: the exact same handler was already registered, no registration
required
EBUSY: somebody else is registered, registration unsuccessful.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 03:23:11 +0000 (20:23 -0700)]
[NETFILTER]: add /proc/net/netfilter interface to nf_queue
This patch adds a /proc/net/netfilter/nf_queue file, similar to the
recently-added /proc/net/netfilter/nf_log. It indicates which queue
handler is registered to which protocol family. This is useful since
there are now multiple queue handlers in the treee (ip[6]_queue,
nfnetlink_queue).
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 03:21:49 +0000 (20:21 -0700)]
[NETFILTER]: split net/core/netfilter.c into net/netfilter/*.c
This patch doesn't introduce any code changes, but merely splits the
core netfilter code into four separate files. It also moves it from
it's old location in net/core/ to the recently-created net/netfilter/
directory.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 03:20:54 +0000 (20:20 -0700)]
[NETFILTER]: ip{6}_queue: prevent unregistration race with nfnetlink_queue
Since nfnetlink_queue can override ip{6}_queue as queue handlers, we
can no longer blindly unregister whoever is registered for PF_INET[6],
but only unregister ourselves.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 03:19:44 +0000 (20:19 -0700)]
[NETFILTER]: move conntrack helper buffers from BSS to kmalloc()ed memory
According to DaveM, it is preferrable to have large data structures be
allocated dynamically from the module init() function rather than
putting them as static global variables into BSS.
This patch moves the conntrack helper packet buffers into dynamically
allocated memory.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 10 Aug 2005 03:17:41 +0000 (20:17 -0700)]
[TG3]: Fix bug in setting a tg3_flag
Found a bug while reviewing the patches the second time.
The TG3_FLAG_TXD_MBOX_HWBUG flag is set after the register access
methods have been determined. This patch fixes it by moving it up before
the various access methods are assigned.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 10 Aug 2005 03:17:28 +0000 (20:17 -0700)]
[TG3]: Eliminate one register write in tg3_restart_ints()
The register write to register 0x68 to restart interrupts is unnecessary
as the interrupt wasn't masked in that register by the irq handler. This
will save one register write in the fast path.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 10 Aug 2005 03:17:14 +0000 (20:17 -0700)]
[TG3]: Add indirect register method for 5703 behind ICH
This patch adds the new workaround for 5703 A1/A2 if it is behind
certain ICH bridges. The workaround disables memory and uses config.
cycles only to access all registers. The 5702/03 chips can mistakenly
decode the special cycles from the ICH chipsets as memory write cycles,
causing corruption of register and memory space. Only certain ICH
bridges will drive special cycles with non-zero data during the address
phase which can fall within the 5703's address range. This is not an ICH
bug as the PCI spec allows non-zero address during special cycles.
However, only these ICH bridges are known to drive non-zero addresses
during special cycles.
The indirect_lock is also changed to spin_lock_irqsave from spin_lock_bh
because it is used in irq handler when using the indirect method to
disable interrupts.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 10 Aug 2005 03:16:46 +0000 (20:16 -0700)]
[TG3]: Add various register methods
This patch adds various dedicated register read/write methods for the
existing workarounds, including PCIX target workaround, write with read
flush, etc. The chips that require these workarounds will use these
dedicated access functions.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 10 Aug 2005 03:16:32 +0000 (20:16 -0700)]
[TG3]: Add basic register access function pointers
This patch adds the basic function pointers to do register accesses in
the fast path. This was suggested by David Miller. The idea is that
various register access methods for different hardware errata can easily
be implemented with these function pointers and performance will not be
degraded on chips that use normal register access methods.
The various register read write macros (e.g. tw32, tr32, tw32_mailbox)
are redefined to call the function pointers.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yoshifumi Nishida <nishida@csl.sony.co.jp> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[ICSK]: Move generalised functions from tcp to inet_connection_sock
This also improves reqsk_queue_prune and renames it to
inet_csk_reqsk_queue_prune, as it deals with both inet_connection_sock
and inet_request_sock objects, not just with request_sock ones thus
belonging to inet_request_sock.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[ICSK]: Introduce reqsk_queue_prune from code in tcp_synack_timer
With this we're very close to getting all of the current TCP
refactorings in my dccp-2.6 tree merged, next changeset will export
some functions needed by the current DCCP code and then dccp-2.6.git
will be born!
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
This also moved inet_iif from tcp to inet_hashtables.h, as it is
needed by the inet_lookup callers, perhaps this needs a bit of
polishing, but for now seems fine.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[NET]: Just move the inet_connection_sock function from tcp sources
Completing the previous changeset, this also generalises tcp_v4_synq_add,
renaming it to inet_csk_reqsk_queue_hash_add, already geing used in the
DCCP tree, which I plan to merge RSN.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
This creates struct inet_connection_sock, moving members out of struct
tcp_sock that are shareable with other INET connection oriented
protocols, such as DCCP, that in my private tree already uses most of
these members.
The functions that operate on these members were renamed, using a
inet_csk_ prefix while not being moved yet to a new file, so as to
ease the review of these changes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
This paves the way to generalise the rest of the sock ID lookup
routines and saves some bytes in TCPv4 TIME_WAIT sockets on distro
kernels (where IPv6 is always built as a module):
Now if a protocol wants to use the TIME_WAIT generic infrastructure it
only has to set the sk_prot->twsk_obj_size field with the size of its
inet_timewait_sock derived sock and proto_register will create
sk_prot->twsk_slab, for now its only for INET sockets, but we can
introduce timewait_sock later if some non INET transport protocolo
wants to use this stuff.
Next changesets will take advantage of this new infrastructure to
generalise even more TCP code.
It really just makes the existing code be a helper function that
tcp_v4_hash and tcp_unhash uses, specifying the right inet_hashinfo,
tcp_hashinfo.
One thing I'll investigate at some point is to have the inet_hashinfo
pointer in sk_prot, so that we get all the hashtable information from
the sk pointer, this can lead to some extra indirections that may well
hurt performance/code size, we'll see. Ultimate idea would be that
sk_prot would provide _all_ the information about a protocol
implementation.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Also expose all of the tcp_hashinfo members, i.e. killing those
tcp_ehash, etc macros, this will more clearly expose already generic
functions and some that need just a bit of work to become generic, as
we'll see in the upcoming changesets.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: fix list traversal order in ctnetlink
Currently conntracks are inserted after the head. That means that
conntracks are sorted from the biggest to the smallest id. This happens
because we use list_prepend (list_add) instead list_add_tail. This can
result in problems during the list iteration.
list_for_each(i, &ip_conntrack_hash[cb->args[0]]) {
h = (struct ip_conntrack_tuple_hash *) i;
if (DIRECTION(h) != IP_CT_DIR_ORIGINAL)
continue;
ct = tuplehash_to_ctrack(h);
if (ct->id <= *id)
continue;
In that case just the first conntrack in the bucket will be dumped. To
fix this, we iterate the list from the tail to the head via
list_for_each_prev. Same thing for the list of expectations.
Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: Fix typo in ctnl_exp_cb array (no bug, just memory waste)
This fixes the size of the ctnl_exp_cb array that is IPCTNL_MSG_EXP_MAX
instead of IPCTNL_MSG_MAX. Simple typo.
Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: fix conntrack refcount leak in unlink_expect()
In unlink_expect(), the expectation is removed from the list so the
refcount must be dropped as well.
Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: ctnetlink: make sure event order is correct
The following sequence is displayed during events dumping of an ICMP
connection: [NEW] [DESTROY] [UPDATE]
This happens because the event IPCT_DESTROY is delivered in
death_by_timeout(), that is called from the icmp protocol helper
(ct->timeout.function) once we see the reply.
To fix this, we move this event to destroy_conntrack().
Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 03:04:07 +0000 (20:04 -0700)]
[NETFILTER]: don't use nested attributes for conntrack_expect
We used to use nested nfattr structures for ip_conntrack_expect. This is
bogus, since ip_conntrack and ip_conntrack_expect are communicated in
different netlink message types. both should be encoded at the top level
attributes, no extra nesting required. This patch addresses the issue.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 03:03:54 +0000 (20:03 -0700)]
[NETFILTER]: cleanup nfnetlink_check_attributes()
1) memset return parameter 'cda' (nfattr pointer array) only on success
2) a message without attributes and just a 'struct nfgenmsg' is valid,
don't return -EINVAL
3) use likely() and unlikely() where apropriate
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 03:03:40 +0000 (20:03 -0700)]
[NETFILTER]: attribute count is an attribute of message type, not subsytem
Prior to this patch, every nfnetlink subsystem had to specify it's
attribute count. However, in reality the attribute count depends on
the message type within the subsystem, not the subsystem itself. This
patch moves 'attr_count' from 'struct nfnetlink_subsys' into
nfnl_callback to fix this.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
There was a stupid copy+paste mistake where we parse the MASK nfattr into
the "tuple" variable instead of the "mask" variable. This patch fixes it.
Thanks to Pablo Neira.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira [Wed, 10 Aug 2005 03:02:55 +0000 (20:02 -0700)]
[NETFILTER]: conntrack_netlink: Fix locking during conntrack_create
The current codepath allowed for ip_conntrack_lock to be unlock'ed twice.
Signed-off-by: Pablo Neira <pablo@eurodev.net> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira [Wed, 10 Aug 2005 03:02:36 +0000 (20:02 -0700)]
[NETFILTER]: remove bogus memset() calls from ip_conntrack_netlink.c
nfattr_parse_nested() calls nfattr_parse() which in turn does a memset
on the 'tb' array. All callers therefore don't need to memset before
calling it.
Signed-off-by: Pablo Neira <pablo@eurodev.net> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Wed, 10 Aug 2005 03:02:13 +0000 (20:02 -0700)]
[NETFILTER]: Fix multiple problems with the conntrack event cache
refcnt underflow: the reference count is decremented when a conntrack
entry is removed from the hash but it is not incremented when entering
new entries.
missing protection of process context against softirq context: all
cache operations need to locally disable softirqs to avoid races.
Additionally the event cache can't be initialized when a packet
enteres the conntrack code but needs to be initialized whenever we
cache an event and the stored conntrack entry doesn't match the
current one.
incorrect flushing of the event cache in ip_ct_iterate_cleanup:
without real locking we can't flush the cache for different CPUs
without incurring races. The cache for different CPUs can only be
flushed when no packets are going through the
code. ip_ct_iterate_cleanup doesn't need to drop all references, so
flushing is moved to the cleanup path.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
This should really be in a inet_connection_sock, but I'm leaving it
for a later optimization, when some more fields common to INET
transport protocols now in tcp_sk or inet_sk will be chunked out into
inet_connection_sock, for now its better to concentrate on getting the
changes in the core merged to leave the DCCP tree with only DCCP
specific code.
Next changesets will take advantage of this move to generalise things
like tcp_bind_hash, tcp_put_port, tcp_inherit_port, making the later
receive a inet_hashinfo parameter, and even __tcp_tw_hashdance, etc in
the future, when tcp_tw_bucket gets transformed into the struct
timewait_sock hierarchy.
tcp_destroy_sock also is eligible as soon as tcp_orphan_count gets
moved to sk_prot.
A cascade of incremental changes will ultimately make the tcp_lookup
functions be fully generic.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[INET]: Just rename the TCP hashtable functions/structs to inet_
This is to break down the complexity of the series of patches,
making it very clear that this one just does:
1. renames tcp_ prefixed hashtable functions and data structures that
were already mostly generic to inet_ to share it with DCCP and
other INET transport protocols.
2. Removes not used functions (__tb_head & tb_head)
3. Removes some leftover prototypes in the headers (tcp_bucket_unlock &
tcp_v4_build_header)
Next changesets will move tcp_sk(sk)->bind_hash to inet_sock so that we can
make functions such as tcp_inherit_port, __tcp_inherit_port, tcp_v4_get_port,
__tcp_put_port, generic and get others like tcp_destroy_sock closer to generic
(tcp_orphan_count will go to sk->sk_prot to allow this).
Eventually most of these functions will be used passing the transport protocol
inet_hashinfo structure.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 02:58:39 +0000 (19:58 -0700)]
[NETFILTER]: Add new "nfnetlink_log" userspace packet logging facility
This is a generic (layer3 independent) version of what ipt_ULOG is already
doing for IPv4 today. ipt_ULOG, ebt_ulog and finally also ip[6]t_LOG will
be deprecated by this mechanism in the long term.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 02:58:27 +0000 (19:58 -0700)]
[NETFILTER]: Extend netfilter logging API
This patch is in preparation to nfnetlink_log:
- loggers now have to register struct nf_logger instead of nf_logfn
- nf_log_unregister() replaced by nf_log_unregister_pf() and
nf_log_unregister_logger()
- add comment to ip[6]t_LOG.h to assure nobody redefines flags
- add /proc/net/netfilter/nf_log to tell user which logger is currently
registered for which address family
- if user has configured logging, but no logging backend (logger) is
available, always spit a message to syslog, not just the first time.
- split ip[6]t_LOG.c into two parts:
Backend: Always try to register as logger for the respective address family
Frontend: Always log via nf_log_packet() API
- modify all users of nf_log_packet() to accomodate additional argument
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
From tcp_v4_rebuild_header, that already was pretty generic, I only
needed to use sk->sk_protocol instead of the hardcoded IPPROTO_TCP and
establish the requirement that INET transport layer protocols that
want to use this function map TCP_SYN_SENT to its equivalent state.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
From tcp_v4_setup_caps, that always is preceded by a call to
__sk_dst_set, so coalesce this sequence into sk_setup_caps, removing
one call to a TCP function in the IP layer.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew McDonald [Wed, 10 Aug 2005 02:44:42 +0000 (19:44 -0700)]
[IPV6]: Check interface bindings on IPv6 raw socket reception
Take account of whether a socket is bound to a particular device when
selecting an IPv6 raw socket to receive a packet. Also perform this
check when receiving IPv6 packets with router alert options.
Signed-off-by: Andrew McDonald <andrew@mcdonald.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 02:44:15 +0000 (19:44 -0700)]
[NETFILTER]: Add "nfnetlink_queue" netfilter queue handler over nfnetlink
- Add new nfnetlink_queue module
- Add new ipt_NFQUEUE and ip6t_NFQUEUE modules to access queue numbers 1-65535
- Mark ip_queue and ip6_queue Kconfig options as OBSOLETE
- Update feature-removal-schedule to remove ip[6]_queue in December
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 02:43:44 +0000 (19:43 -0700)]
[NETFILTER]: Core changes required by upcoming nfnetlink_queue code
- split netfiler verdict in 16bit verdict and 16bit queue number
- add 'queuenum' argument to nf_queue_outfn_t and its users ip[6]_queue
- move NFNL_SUBSYS_ definitions from enum to #define
- introduce autoloading for nfnetlink subsystem modules
- add MODULE_ALIAS_NFNL_SUBSYS macro
- add nf_unregister_queue_handlers() to register all handlers for a given
nf_queue_outfn_t
- add more verbose DEBUGP macro definition to nfnetlink.c
- make nfnetlink_subsys_register fail if subsys already exists
- add some more comments and debug statements to nfnetlink.c
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 02:40:55 +0000 (19:40 -0700)]
[NETLINK]: Add properly module refcounting for kernel netlink sockets.
- Remove bogus code for compiling netlink as module
- Add module refcounting support for modules implementing a netlink
protocol
- Add support for autoloading modules that implement a netlink protocol
as soon as someone opens a socket for that protocol
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 02:39:00 +0000 (19:39 -0700)]
[NETFILTER]: Move ipv4 specific code from net/core/netfilter.c to net/ipv4/netfilter.c
Netfilter cleanup
- Move ipv4 code from net/core/netfilter.c to net/ipv4/netfilter.c
- Move ipv6 netfilter code from net/ipv6/ip6_output.c to net/ipv6/netfilter.c
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Adrian Bunk [Wed, 10 Aug 2005 02:35:47 +0000 (19:35 -0700)]
[IPV4]: possible cleanups
This patch contains the following possible cleanups:
- make needlessly global code static
- #if 0 the following unused global function:
- xfrm4_state.c: xfrm4_state_fini
- remove the following unneeded EXPORT_SYMBOL's:
- ip_output.c: ip_finish_output
- ip_output.c: sysctl_ip_default_ttl
- fib_frontend.c: ip_dev_find
- inetpeer.c: inet_peer_idlock
- ip_options.c: ip_options_compile
- ip_options.c: ip_options_undo
- net/core/request_sock.c: sysctl_max_syn_backlog
Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 02:32:58 +0000 (19:32 -0700)]
[NETFILTER]: Add ctnetlink subsystem
Add ctnetlink subsystem for userspace-access to ip_conntrack table.
This allows reading and updating of existing entries, as well as
creating new ones (and new expect's) via nfnetlink.
Please note the 'strange' byte order: nfattr (tag+length) are in host
byte order, while the payload is always guaranteed to be in network
byte order. This allows a simple userspace process to encapsulate netlink
messages into arch-independent udp packets by just processing/swapping the
headers and not knowing anything about the actual payload.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
This removes the private element from skbuff, that is only used by
HIPPI. Instead it uses skb->cb[] to hold the additional data that is
needed in the output path from hard_header to device driver.
PS: The only qdisc that might potentially corrupt this cb[] is if
netem was used over HIPPI. I will take care of that by fixing netem
to use skb->stamp. I don't expect many users of netem over HIPPI
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 02:30:24 +0000 (19:30 -0700)]
[NETFITLER]: Add nfnetlink layer.
Introduce "nfnetlink" (netfilter netlink) layer. This layer is used as
transport layer for all userspace communication of the new upcoming
netfilter subsystems, such as ctnetlink, nfnetlink_queue and some day even
the mythical pkttables ;)
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Wed, 10 Aug 2005 02:28:03 +0000 (19:28 -0700)]
[NETFILTER]: connection tracking event notifiers
This adds a notifier chain based event mechanism for ip_conntrack state
changes. As opposed to the previous implementations in patch-o-matic, we
do no longer need a field in the skb to achieve this.
Thanks to the valuable input from Patrick McHardy and Rusty on the idea
of a per_cpu implementation.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 10 Aug 2005 02:25:21 +0000 (19:25 -0700)]
[NET]: Kill skb->list
Remove the "list" member of struct sk_buff, as it is entirely
redundant. All SKB list removal callers know which list the
SKB is on, so storing this in sk_buff does nothing other than
taking up some space.
Two tricky bits were SCTP, which I took care of, and two ATM
drivers which Francois Romieu <romieu@fr.zoreil.com> fixed
up.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Harald Welte [Wed, 10 Aug 2005 02:24:19 +0000 (19:24 -0700)]
[NETFILTER]: reduce netfilter sk_buff enlargement
As discussed at netconf'05, we're trying to save every bit in sk_buff.
The patch below makes sk_buff 8 bytes smaller. I did some basic
testing on my notebook and it seems to work.
The only real in-tree user of nfcache was IPVS, who only needs a
single bit. Unfortunately I couldn't find some other free bit in
sk_buff to stuff that bit into, so I introduced a separate field for
them. Maybe the IPVS guys can resolve that to further save space.
Initially I wanted to shrink pkt_type to three bits (PACKET_HOST and
alike are only 6 values defined), but unfortunately the bluetooth code
overloads pkt_type :(
The conntrack-event-api (out-of-tree) uses nfcache, but Rusty just
came up with a way how to do it without any skb fields, so it's safe
to remove it.
- remove all never-implemented 'nfcache' code
- don't have ipvs code abuse 'nfcache' field. currently get's their own
compile-conditional skb->ipvs_property field. IPVS maintainers can
decide to move this bit elswhere, but nfcache needs to die.
- remove skb->nfcache field to save 4 bytes
- move skb->nfctinfo into three unused bits to save further 4 bytes
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 29 Aug 2005 19:46:07 +0000 (12:46 -0700)]
[SPARC64]: Make debugging spinlocks usable again.
When the spinlock routines were moved out of line into
kernel/spinlock.c this made it so that the debugging
spinlocks record lock acquisition program counts in the
kernel/spinlock.c functions not in their callers.
This makes the debugging info kind of useless.
So record the correct caller's program counter and
now this feature is useful once more.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 29 Aug 2005 19:45:11 +0000 (12:45 -0700)]
[SPARC64]: Revamp Spitfire error trap handling.
Current uncorrectable error handling was poor enough
that the processor could just loop taking the same
trap over and over again. Fix things up so that we
at least get a log message and perhaps even some register
state.
In the process, much consolidation became possible,
particularly with the correctable error handler.
Prefix assembler and C function names with "spitfire"
to indicate that these are for Ultra-I/II/IIi/IIe only.
More work is needed to make these routines robust and
featureful to the level of the Ultra-III error handlers.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 29 Aug 2005 19:44:40 +0000 (12:44 -0700)]
[SPARC64]: Fix trap state reading for instruction_access_exception.
1) Read ASI_IMMU SFSR not ASI_DMMU.
2) IMMU has no SFAR, read TPC instead
3) Delete old and incorrect comment about the DTLB protection
trap having a dependency on the SFSR contents in order to
function correctly
Signed-off-by: David S. Miller <davem@davemloft.net>