File Revert-libcxgb3-Remove-libcxgb3-from-rdma-core.patch of Package rdma-core
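Before the patch itself, a minimal usage sketch (not part of the patch): once the cxgb3 provider is built back into rdma-core, applications reach it through the ordinary libibverbs entry points; ibv_poll_cq() dispatches to the provider's poll_cq hook, which the code below registers as t3a_poll_cq()/t3b_poll_cq() depending on the HCA type. The device index, CQ depth, and error handling here are illustrative only.

/*
 * Hedged example: exercising the restored cxgb3 provider through the
 * standard verbs API. Assumes the first enumerated device is an
 * iw_cxgb3-backed adapter; all numeric values are arbitrary examples.
 */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
	int num, n;
	struct ibv_device **list = ibv_get_device_list(&num);
	struct ibv_context *ctx;
	struct ibv_cq *cq;
	struct ibv_wc wc;

	if (!list || num == 0)
		return 1;
	ctx = ibv_open_device(list[0]);		/* e.g. an iw_cxgb3 device */
	if (!ctx)
		return 1;
	cq = ibv_create_cq(ctx, 16, NULL, NULL, 0);
	if (!cq)
		return 1;
	n = ibv_poll_cq(cq, 1, &wc);		/* ends up in t3a/t3b_poll_cq() */
	printf("polled %d completions\n", n);
	ibv_destroy_cq(cq);
	ibv_close_device(ctx);
	ibv_free_device_list(list);
	return 0;
}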
commit 46e6822db968a7749dd20269afd30e1793936498
Author: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Date:   Fri Jan 10 09:07:51 2020 +0100

    Revert "libcxgb3: Remove libcxgb3 from rdma-core"

    This reverts commit 36588f5844af4ef1e5b0d6ad002fa1adf9032653.

diff --git CMakeLists.txt CMakeLists.txt
index b3ff156956a0..5adddc0160d3 100644
--- CMakeLists.txt
+++ CMakeLists.txt
@@ -615,6 +615,7 @@ add_subdirectory(librdmacm/man)
 # Providers
 if (HAVE_COHERENT_DMA)
 add_subdirectory(providers/bnxt_re)
+add_subdirectory(providers/cxgb3) # NO SPARSE
 add_subdirectory(providers/cxgb4) # NO SPARSE
 add_subdirectory(providers/efa)
 add_subdirectory(providers/efa/man)
diff --git MAINTAINERS MAINTAINERS
index 8030891d96a4..e47b97b2ff70 100644
--- MAINTAINERS
+++ MAINTAINERS
@@ -51,6 +51,11 @@ M: Devesh Sharma <Devesh.sharma@broadcom.com>
 S: Supported
 F: providers/bnxt_re/
 
+CXGB3 USERSPACE PROVIDER (for iw_cxgb3.ko)
+M: Steve Wise <swise@opengridcomputing.com>
+S: Supported
+F: providers/cxgb3/
+
 CXGB4 USERSPACE PROVIDER (for iw_cxgb4.ko)
 M: Steve Wise <swise@opengridcomputing.com>
 S: Supported
diff --git README.md README.md
index 24eee90cb8b7..565b97a908dd 100644
--- README.md
+++ README.md
@@ -15,6 +15,7 @@ under the providers/ directory. Support for the following Kernel RDMA drivers
 is included:
 
  - efa.ko
+ - iw_cxgb3.ko
 - iw_cxgb4.ko
 - hfi1.ko
 - hns-roce.ko
diff --git debian/control debian/control
index 738b2d6da39d..1e002808cc40 100644
--- debian/control
+++ debian/control
@@ -93,6 +93,7 @@ Description: User space provider drivers for libibverbs
 This package contains the user space verbs drivers:
 .
  - bnxt_re: Broadcom NetXtreme-E RoCE HCAs
+ - cxgb3: Chelsio T3 iWARP HCAs
 - cxgb4: Chelsio T4 iWARP HCAs
 - efa: Amazon Elastic Fabric Adapter
 - hfi1verbs: Intel Omni-Path HFI
diff --git debian/copyright debian/copyright
index db4951993bd8..c6d798d4c30e 100644
--- debian/copyright
+++ debian/copyright
@@ -148,7 +148,8 @@ Files: providers/bnxt_re/*
 Copyright: 2015-2017, Broadcom Limited and/or its subsidiaries
 License: BSD-2-clause or GPL-2
 
-Files: providers/cxgb4/*
+Files: providers/cxgb3/*
+       providers/cxgb4/*
 Copyright: 2003-2016, Chelsio Communications, Inc.
 License: BSD-MIT or GPL-2
 
diff --git kernel-boot/rdma-description.rules kernel-boot/rdma-description.rules
index 4ea59ba1977b..bb33dce40bd6 100644
--- kernel-boot/rdma-description.rules
+++ kernel-boot/rdma-description.rules
@@ -22,6 +22,7 @@ DRIVERS=="ib_qib", ENV{ID_RDMA_INFINIBAND}="1"
 DRIVERS=="hfi1", ENV{ID_RDMA_OPA}="1"
 
 # Hardware that supports iWarp
+DRIVERS=="cxgb3", ENV{ID_RDMA_IWARP}="1"
 DRIVERS=="cxgb4", ENV{ID_RDMA_IWARP}="1"
 DRIVERS=="i40e", ENV{ID_RDMA_IWARP}="1"
 DRIVERS=="nes", ENV{ID_RDMA_IWARP}="1"
diff --git kernel-boot/rdma-hw-modules.rules kernel-boot/rdma-hw-modules.rules
index da4bbe363ac4..dde0ab8dacac 100644
--- kernel-boot/rdma-hw-modules.rules
+++ kernel-boot/rdma-hw-modules.rules
@@ -8,6 +8,7 @@ SUBSYSTEM!="net", GOTO="rdma_hw_modules_end"
 # RDMA.
 ENV{ID_NET_DRIVER}=="be2net", RUN{builtin}+="kmod load ocrdma"
 ENV{ID_NET_DRIVER}=="bnxt_en", RUN{builtin}+="kmod load bnxt_re"
+ENV{ID_NET_DRIVER}=="cxgb3", RUN{builtin}+="kmod load iw_cxgb3"
 ENV{ID_NET_DRIVER}=="cxgb4", RUN{builtin}+="kmod load iw_cxgb4"
 ENV{ID_NET_DRIVER}=="hns", RUN{builtin}+="kmod load hns_roce"
 ENV{ID_NET_DRIVER}=="i40e", RUN{builtin}+="kmod load i40iw"
diff --git libibverbs/verbs.h libibverbs/verbs.h
index d873f6d07327..8b580d101ce2 100644
--- libibverbs/verbs.h
+++ libibverbs/verbs.h
@@ -2156,6 +2156,7 @@ struct ibv_device **ibv_get_device_list(int *num_devices);
 
 struct verbs_devices_ops;
 extern const struct verbs_device_ops verbs_provider_bnxt_re;
+extern const struct verbs_device_ops verbs_provider_cxgb3;
 extern const struct verbs_device_ops verbs_provider_cxgb4;
 extern const struct verbs_device_ops verbs_provider_efa;
 extern const struct verbs_device_ops verbs_provider_hfi1verbs;
diff --git providers/cxgb3/CMakeLists.txt providers/cxgb3/CMakeLists.txt
new file mode 100644
index 000000000000..a578105e7b28
--- /dev/null
+++ providers/cxgb3/CMakeLists.txt
@@ -0,0 +1,6 @@
+rdma_provider(cxgb3
+  cq.c
+  iwch.c
+  qp.c
+  verbs.c
+)
diff --git providers/cxgb3/cq.c providers/cxgb3/cq.c
new file mode 100644
index 000000000000..6cb4fe74d064
--- /dev/null
+++ providers/cxgb3/cq.c
@@ -0,0 +1,442 @@
+/*
+ * Copyright (c) 2006-2007 Chelsio, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */ +#include <config.h> + +#include <stdio.h> +#include <pthread.h> +#include <sys/errno.h> + +#include <infiniband/opcode.h> + +#include "iwch.h" +#include "iwch-abi.h" + +int iwch_arm_cq(struct ibv_cq *ibcq, int solicited) +{ + int ret; + struct iwch_cq *chp = to_iwch_cq(ibcq); + + pthread_spin_lock(&chp->lock); + ret = ibv_cmd_req_notify_cq(ibcq, solicited); + pthread_spin_unlock(&chp->lock); + + return ret; +} + +static inline void flush_completed_wrs(struct t3_wq *wq, struct t3_cq *cq) +{ + struct t3_swsq *sqp; + uint32_t ptr = wq->sq_rptr; + int count = Q_COUNT(wq->sq_rptr, wq->sq_wptr); + + sqp = wq->sq + Q_PTR2IDX(ptr, wq->sq_size_log2); + while (count--) { + if (!sqp->signaled) { + ptr++; + sqp = wq->sq + Q_PTR2IDX(ptr, wq->sq_size_log2); + } else if (sqp->complete) { + + /* + * Insert this completed cqe into the swcq. + */ + sqp->cqe.header |= htobe32(V_CQE_SWCQE(1)); + *(cq->sw_queue + Q_PTR2IDX(cq->sw_wptr, cq->size_log2)) + = sqp->cqe; + cq->sw_wptr++; + sqp->signaled = 0; + break; + } else + break; + } +} + +static inline void create_read_req_cqe(struct t3_wq *wq, + struct t3_cqe *hw_cqe, + struct t3_cqe *read_cqe) +{ + CQE_WRID_SQ_WPTR(*read_cqe) = wq->oldest_read->sq_wptr; + read_cqe->len = wq->oldest_read->read_len; + read_cqe->header = htobe32(V_CQE_QPID(CQE_QPID(*hw_cqe)) | + V_CQE_SWCQE(SW_CQE(*hw_cqe)) | + V_CQE_OPCODE(T3_READ_REQ) | + V_CQE_TYPE(1)); +} + +/* + * Return a ptr to the next read wr in the SWSQ or NULL. + */ +static inline void advance_oldest_read(struct t3_wq *wq) +{ + + uint32_t rptr = wq->oldest_read - wq->sq + 1; + uint32_t wptr = Q_PTR2IDX(wq->sq_wptr, wq->sq_size_log2); + + while (Q_PTR2IDX(rptr, wq->sq_size_log2) != wptr) { + wq->oldest_read = wq->sq + Q_PTR2IDX(rptr, wq->sq_size_log2); + + if (wq->oldest_read->opcode == T3_READ_REQ) { + return; + } + rptr++; + } + wq->oldest_read = NULL; +} + +static inline int cxio_poll_cq(struct t3_wq *wq, struct t3_cq *cq, + struct t3_cqe *cqe, uint8_t *cqe_flushed, + uint64_t *cookie) +{ + int ret = 0; + struct t3_cqe *hw_cqe, read_cqe; + + *cqe_flushed = 0; + hw_cqe = cxio_next_cqe(cq); + udma_from_device_barrier(); + + /* + * Skip cqes not affiliated with a QP. + */ + if (wq == NULL) { + ret = -1; + goto skip_cqe; + } + + /* + * Gotta tweak READ completions: + * 1) the cqe doesn't contain the sq_wptr from the wr. + * 2) opcode not reflected from the wr. + * 3) read_len not reflected from the wr. + * 4) cq_type is RQ_TYPE not SQ_TYPE. + */ + if (CQE_OPCODE(*hw_cqe) == T3_READ_RESP) { + + /* + * If this is an unsolicited read response to local stag 1, + * then the read was generated by the kernel driver as part + * of peer-2-peer connection setup. So ignore the completion. + */ + if (CQE_WRID_STAG(*hw_cqe) == 1) { + if (CQE_STATUS(*hw_cqe)) + wq->error = 1; + ret = -1; + goto skip_cqe; + } + + /* + * Don't write to the HWCQ, so create a new read req CQE + * in local memory. + */ + create_read_req_cqe(wq, hw_cqe, &read_cqe); + hw_cqe = &read_cqe; + advance_oldest_read(wq); + } + + /* + * Errors. + */ + if (CQE_STATUS(*hw_cqe) || t3_wq_in_error(wq)) { + *cqe_flushed = t3_wq_in_error(wq); + t3_set_wq_in_error(wq); + goto proc_cqe; + } + + /* + * RECV completion. + */ + if (RQ_TYPE(*hw_cqe)) { + + /* + * HW only validates 4 bits of MSN. So we must validate that + * the MSN in the SEND is the next expected MSN. If its not, + * then we complete this with TPT_ERR_MSN and mark the wq in + * error. 
+ */ + if ((CQE_WRID_MSN(*hw_cqe) != (wq->rq_rptr + 1))) { + t3_set_wq_in_error(wq); + hw_cqe->header |= htobe32(V_CQE_STATUS(TPT_ERR_MSN)); + } + goto proc_cqe; + } + + /* + * If we get here its a send completion. + * + * Handle out of order completion. These get stuffed + * in the SW SQ. Then the SW SQ is walked to move any + * now in-order completions into the SW CQ. This handles + * 2 cases: + * 1) reaping unsignaled WRs when the first subsequent + * signaled WR is completed. + * 2) out of order read completions. + */ + if (!SW_CQE(*hw_cqe) && (CQE_WRID_SQ_WPTR(*hw_cqe) != wq->sq_rptr)) { + struct t3_swsq *sqp; + + sqp = wq->sq + + Q_PTR2IDX(CQE_WRID_SQ_WPTR(*hw_cqe), wq->sq_size_log2); + sqp->cqe = *hw_cqe; + sqp->complete = 1; + ret = -1; + goto flush_wq; + } + +proc_cqe: + *cqe = *hw_cqe; + + /* + * Reap the associated WR(s) that are freed up with this + * completion. + */ + if (SQ_TYPE(*hw_cqe)) { + wq->sq_rptr = CQE_WRID_SQ_WPTR(*hw_cqe); + *cookie = (wq->sq + + Q_PTR2IDX(wq->sq_rptr, wq->sq_size_log2))->wr_id; + wq->sq_rptr++; + } else { + *cookie = *(wq->rq + Q_PTR2IDX(wq->rq_rptr, wq->rq_size_log2)); + wq->rq_rptr++; + } + +flush_wq: + /* + * Flush any completed cqes that are now in-order. + */ + flush_completed_wrs(wq, cq); + +skip_cqe: + if (SW_CQE(*hw_cqe)) { + PDBG("%s cq %p cqid 0x%x skip sw cqe sw_rptr 0x%x\n", + __FUNCTION__, cq, cq->cqid, cq->sw_rptr); + ++cq->sw_rptr; + } else { + PDBG("%s cq %p cqid 0x%x skip hw cqe sw_rptr 0x%x\n", + __FUNCTION__, cq, cq->cqid, cq->rptr); + ++cq->rptr; + } + + return ret; +} + +/* + * Get one cq entry from cxio and map it to openib. + * + * Returns: + * 0 EMPTY; + * 1 cqe returned + * -EAGAIN caller must try again + * any other -errno fatal error + */ +static int iwch_poll_cq_one(struct iwch_device *rhp, struct iwch_cq *chp, + struct ibv_wc *wc) +{ + struct iwch_qp *qhp = NULL; + struct t3_cqe cqe, *hw_cqe; + struct t3_wq *wq; + uint8_t cqe_flushed; + uint64_t cookie; + int ret = 1; + + hw_cqe = cxio_next_cqe(&chp->cq); + udma_from_device_barrier(); + + if (!hw_cqe) + return 0; + + qhp = rhp->qpid2ptr[CQE_QPID(*hw_cqe)]; + if (!qhp) + wq = NULL; + else { + pthread_spin_lock(&qhp->lock); + wq = &(qhp->wq); + } + ret = cxio_poll_cq(wq, &(chp->cq), &cqe, &cqe_flushed, &cookie); + if (ret) { + ret = -EAGAIN; + goto out; + } + ret = 1; + + wc->wr_id = cookie; + wc->qp_num = qhp->wq.qpid; + wc->vendor_err = CQE_STATUS(cqe); + wc->wc_flags = 0; + + PDBG("%s qpid 0x%x type %d opcode %d status 0x%x wrid hi 0x%x " + "lo 0x%x cookie 0x%" PRIx64 "\n", + __FUNCTION__, CQE_QPID(cqe), CQE_TYPE(cqe), + CQE_OPCODE(cqe), CQE_STATUS(cqe), CQE_WRID_HI(cqe), + CQE_WRID_LOW(cqe), cookie); + + if (CQE_TYPE(cqe) == 0) { + if (!CQE_STATUS(cqe)) + wc->byte_len = CQE_LEN(cqe); + else + wc->byte_len = 0; + wc->opcode = IBV_WC_RECV; + } else { + switch (CQE_OPCODE(cqe)) { + case T3_RDMA_WRITE: + wc->opcode = IBV_WC_RDMA_WRITE; + break; + case T3_READ_REQ: + wc->opcode = IBV_WC_RDMA_READ; + wc->byte_len = CQE_LEN(cqe); + break; + case T3_SEND: + case T3_SEND_WITH_SE: + wc->opcode = IBV_WC_SEND; + break; + case T3_BIND_MW: + wc->opcode = IBV_WC_BIND_MW; + break; + + /* these aren't supported yet */ + case T3_SEND_WITH_INV: + case T3_SEND_WITH_SE_INV: + case T3_LOCAL_INV: + case T3_FAST_REGISTER: + default: + PDBG("%s Unexpected opcode %d CQID 0x%x QPID 0x%x\n", + __FUNCTION__, CQE_OPCODE(cqe), chp->cq.cqid, + CQE_QPID(cqe)); + ret = -EINVAL; + goto out; + } + } + + if (cqe_flushed) { + wc->status = IBV_WC_WR_FLUSH_ERR; + } else { + + switch (CQE_STATUS(cqe)) { + 
case TPT_ERR_SUCCESS: + wc->status = IBV_WC_SUCCESS; + break; + case TPT_ERR_STAG: + wc->status = IBV_WC_LOC_ACCESS_ERR; + break; + case TPT_ERR_PDID: + wc->status = IBV_WC_LOC_PROT_ERR; + break; + case TPT_ERR_QPID: + case TPT_ERR_ACCESS: + wc->status = IBV_WC_LOC_ACCESS_ERR; + break; + case TPT_ERR_WRAP: + wc->status = IBV_WC_GENERAL_ERR; + break; + case TPT_ERR_BOUND: + wc->status = IBV_WC_LOC_LEN_ERR; + break; + case TPT_ERR_INVALIDATE_SHARED_MR: + case TPT_ERR_INVALIDATE_MR_WITH_MW_BOUND: + wc->status = IBV_WC_MW_BIND_ERR; + break; + case TPT_ERR_CRC: + case TPT_ERR_MARKER: + case TPT_ERR_PDU_LEN_ERR: + case TPT_ERR_OUT_OF_RQE: + case TPT_ERR_DDP_VERSION: + case TPT_ERR_RDMA_VERSION: + case TPT_ERR_DDP_QUEUE_NUM: + case TPT_ERR_MSN: + case TPT_ERR_TBIT: + case TPT_ERR_MO: + case TPT_ERR_MSN_RANGE: + case TPT_ERR_IRD_OVERFLOW: + case TPT_ERR_OPCODE: + wc->status = IBV_WC_FATAL_ERR; + break; + case TPT_ERR_SWFLUSH: + wc->status = IBV_WC_WR_FLUSH_ERR; + break; + default: + PDBG("%s Unexpected status 0x%x CQID 0x%x QPID 0x%0x\n", + __FUNCTION__, CQE_STATUS(cqe), chp->cq.cqid, + CQE_QPID(cqe)); + ret = -EINVAL; + } + } +out: + if (wq) + pthread_spin_unlock(&qhp->lock); + return ret; +} + +int t3b_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc) +{ + struct iwch_device *rhp; + struct iwch_cq *chp; + int npolled; + int err = 0; + + chp = to_iwch_cq(ibcq); + rhp = chp->rhp; + + if (rhp->abi_version > 0 && t3_cq_in_error(&chp->cq)) { + t3_reset_cq_in_error(&chp->cq); + iwch_flush_qps(rhp); + } + + pthread_spin_lock(&chp->lock); + for (npolled = 0; npolled < num_entries; ++npolled) { + + /* + * Because T3 can post CQEs that are out of order, + * we might have to poll again after removing + * one of these. + */ + do { + err = iwch_poll_cq_one(rhp, chp, wc + npolled); + } while (err == -EAGAIN); + if (err <= 0) + break; + } + pthread_spin_unlock(&chp->lock); + + if (err < 0) + return err; + else { + return npolled; + } +} + +int t3a_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc) +{ + int ret; + struct iwch_cq *chp = to_iwch_cq(ibcq); + + pthread_spin_lock(&chp->lock); + ret = ibv_cmd_poll_cq(ibcq, num_entries, wc); + pthread_spin_unlock(&chp->lock); + return ret; +} diff --git providers/cxgb3/cxio_wr.h providers/cxgb3/cxio_wr.h new file mode 100644 index 000000000000..042bd9414220 --- /dev/null +++ providers/cxgb3/cxio_wr.h @@ -0,0 +1,758 @@ +/* + * Copyright (c) 2006-2007 Chelsio, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ +#ifndef __CXIO_WR_H__ +#define __CXIO_WR_H__ + +#include <stddef.h> +#include <stdint.h> +#include <endian.h> +#include <util/udma_barrier.h> +#include "firmware_exports.h" + +#define T3_MAX_NUM_QP (1<<15) +#define T3_MAX_NUM_CQ (1<<15) +#define T3_MAX_NUM_PD (1<<15) +#define T3_MAX_NUM_STAG (1<<15) +#define T3_MAX_SGE 4 +#define T3_MAX_INLINE 64 + +#define Q_EMPTY(rptr,wptr) ((rptr)==(wptr)) +#define Q_FULL(rptr,wptr,size_log2) ( (((wptr)-(rptr))>>(size_log2)) && \ + ((rptr)!=(wptr)) ) +#define Q_GENBIT(ptr,size_log2) (!(((ptr)>>size_log2)&0x1)) +#define Q_FREECNT(rptr,wptr,size_log2) ((1UL<<size_log2)-((wptr)-(rptr))) +#define Q_COUNT(rptr,wptr) ((wptr)-(rptr)) +#define Q_PTR2IDX(ptr,size_log2) (ptr & ((1UL<<size_log2)-1)) + +/* FIXME: Move me to a generic PCI mmio accessor */ +#define cpu_to_pci32(val) htole32(val) + +#define RING_DOORBELL(doorbell, QPID) { \ + *doorbell = cpu_to_pci32(QPID); \ +} + +#define SEQ32_GE(x,y) (!( (((uint32_t) (x)) - ((uint32_t) (y))) & 0x80000000 )) + +enum t3_wr_flags { + T3_COMPLETION_FLAG = 0x01, + T3_NOTIFY_FLAG = 0x02, + T3_SOLICITED_EVENT_FLAG = 0x04, + T3_READ_FENCE_FLAG = 0x08, + T3_LOCAL_FENCE_FLAG = 0x10 +} __attribute__ ((packed)); + +enum t3_wr_opcode { + T3_WR_BP = FW_WROPCODE_RI_BYPASS, + T3_WR_SEND = FW_WROPCODE_RI_SEND, + T3_WR_WRITE = FW_WROPCODE_RI_RDMA_WRITE, + T3_WR_READ = FW_WROPCODE_RI_RDMA_READ, + T3_WR_INV_STAG = FW_WROPCODE_RI_LOCAL_INV, + T3_WR_BIND = FW_WROPCODE_RI_BIND_MW, + T3_WR_RCV = FW_WROPCODE_RI_RECEIVE, + T3_WR_INIT = FW_WROPCODE_RI_RDMA_INIT, + T3_WR_QP_MOD = FW_WROPCODE_RI_MODIFY_QP +} __attribute__ ((packed)); + +enum t3_rdma_opcode { + T3_RDMA_WRITE, /* IETF RDMAP v1.0 ... */ + T3_READ_REQ, + T3_READ_RESP, + T3_SEND, + T3_SEND_WITH_INV, + T3_SEND_WITH_SE, + T3_SEND_WITH_SE_INV, + T3_TERMINATE, + T3_RDMA_INIT, /* CHELSIO RI specific ... 
*/ + T3_BIND_MW, + T3_FAST_REGISTER, + T3_LOCAL_INV, + T3_QP_MOD, + T3_BYPASS +} __attribute__ ((packed)); + +static inline enum t3_rdma_opcode wr2opcode(enum t3_wr_opcode wrop) +{ + switch (wrop) { + case T3_WR_BP: return T3_BYPASS; + case T3_WR_SEND: return T3_SEND; + case T3_WR_WRITE: return T3_RDMA_WRITE; + case T3_WR_READ: return T3_READ_REQ; + case T3_WR_INV_STAG: return T3_LOCAL_INV; + case T3_WR_BIND: return T3_BIND_MW; + case T3_WR_INIT: return T3_RDMA_INIT; + case T3_WR_QP_MOD: return T3_QP_MOD; + default: break; + } + return -1; +} + + +/* Work request id */ +union t3_wrid { + struct { + uint32_t hi:32; + uint32_t low:32; + } id0; + uint64_t id1; +}; + +#define WRID(wrid) (wrid.id1) +#define WRID_GEN(wrid) (wrid.id0.wr_gen) +#define WRID_IDX(wrid) (wrid.id0.wr_idx) +#define WRID_LO(wrid) (wrid.id0.wr_lo) + +struct fw_riwrh { + uint32_t op_seop_flags; + uint32_t gen_tid_len; +}; + +#define S_FW_RIWR_OP 24 +#define M_FW_RIWR_OP 0xff +#define V_FW_RIWR_OP(x) ((x) << S_FW_RIWR_OP) +#define G_FW_RIWR_OP(x) ((((x) >> S_FW_RIWR_OP)) & M_FW_RIWR_OP) + +#define S_FW_RIWR_SOPEOP 22 +#define M_FW_RIWR_SOPEOP 0x3 +#define V_FW_RIWR_SOPEOP(x) ((x) << S_FW_RIWR_SOPEOP) + +#define S_FW_RIWR_FLAGS 8 +#define M_FW_RIWR_FLAGS 0x3fffff +#define V_FW_RIWR_FLAGS(x) ((x) << S_FW_RIWR_FLAGS) +#define G_FW_RIWR_FLAGS(x) ((((x) >> S_FW_RIWR_FLAGS)) & M_FW_RIWR_FLAGS) + +#define S_FW_RIWR_TID 8 +#define V_FW_RIWR_TID(x) ((x) << S_FW_RIWR_TID) + +#define S_FW_RIWR_LEN 0 +#define V_FW_RIWR_LEN(x) ((x) << S_FW_RIWR_LEN) + +#define S_FW_RIWR_GEN 31 +#define V_FW_RIWR_GEN(x) ((x) << S_FW_RIWR_GEN) + +struct t3_sge { + uint32_t stag; + uint32_t len; + uint64_t to; +}; + +/* If num_sgle is zero, flit 5+ contains immediate data.*/ +struct t3_send_wr { + struct fw_riwrh wrh; /* 0 */ + union t3_wrid wrid; /* 1 */ + + enum t3_rdma_opcode rdmaop:8; + uint32_t reserved:24; /* 2 */ + uint32_t rem_stag; /* 2 */ + uint32_t plen; /* 3 */ + uint32_t num_sgle; + struct t3_sge sgl[T3_MAX_SGE]; /* 4+ */ +}; + +struct t3_local_inv_wr { + struct fw_riwrh wrh; /* 0 */ + union t3_wrid wrid; /* 1 */ + uint32_t stag; /* 2 */ + uint32_t reserved3; +}; + +struct t3_rdma_write_wr { + struct fw_riwrh wrh; /* 0 */ + union t3_wrid wrid; /* 1 */ + enum t3_rdma_opcode rdmaop:8; /* 2 */ + uint32_t reserved:24; /* 2 */ + uint32_t stag_sink; + uint64_t to_sink; /* 3 */ + uint32_t plen; /* 4 */ + uint32_t num_sgle; + struct t3_sge sgl[T3_MAX_SGE]; /* 5+ */ +}; + +struct t3_rdma_read_wr { + struct fw_riwrh wrh; /* 0 */ + union t3_wrid wrid; /* 1 */ + enum t3_rdma_opcode rdmaop:8; /* 2 */ + uint32_t reserved:24; + uint32_t rem_stag; + uint64_t rem_to; /* 3 */ + uint32_t local_stag; /* 4 */ + uint32_t local_len; + uint64_t local_to; /* 5 */ +}; + +enum t3_addr_type { + T3_VA_BASED_TO = 0x0, + T3_ZERO_BASED_TO = 0x1 +} __attribute__ ((packed)); + +enum t3_mem_perms { + T3_MEM_ACCESS_LOCAL_READ = 0x1, + T3_MEM_ACCESS_LOCAL_WRITE = 0x2, + T3_MEM_ACCESS_REM_READ = 0x4, + T3_MEM_ACCESS_REM_WRITE = 0x8 +} __attribute__ ((packed)); + +struct t3_bind_mw_wr { + struct fw_riwrh wrh; /* 0 */ + union t3_wrid wrid; /* 1 */ + uint32_t reserved:16; + enum t3_addr_type type:8; + enum t3_mem_perms perms:8; /* 2 */ + uint32_t mr_stag; + uint32_t mw_stag; /* 3 */ + uint32_t mw_len; + uint64_t mw_va; /* 4 */ + uint32_t mr_pbl_addr; /* 5 */ + uint32_t reserved2:24; + uint32_t mr_pagesz:8; +}; + +struct t3_receive_wr { + struct fw_riwrh wrh; /* 0 */ + union t3_wrid wrid; /* 1 */ + uint8_t pagesz[T3_MAX_SGE]; + uint32_t num_sgle; /* 2 */ + struct t3_sge 
sgl[T3_MAX_SGE]; /* 3+ */ + uint32_t pbl_addr[T3_MAX_SGE]; +}; + +struct t3_bypass_wr { + struct fw_riwrh wrh; + union t3_wrid wrid; /* 1 */ +}; + +struct t3_modify_qp_wr { + struct fw_riwrh wrh; /* 0 */ + union t3_wrid wrid; /* 1 */ + uint32_t flags; /* 2 */ + uint32_t quiesce; /* 2 */ + uint32_t max_ird; /* 3 */ + uint32_t max_ord; /* 3 */ + uint64_t sge_cmd; /* 4 */ + uint64_t ctx1; /* 5 */ + uint64_t ctx0; /* 6 */ +}; + +enum t3_modify_qp_flags { + MODQP_QUIESCE = 0x01, + MODQP_MAX_IRD = 0x02, + MODQP_MAX_ORD = 0x04, + MODQP_WRITE_EC = 0x08, + MODQP_READ_EC = 0x10, +}; + + +enum t3_mpa_attrs { + uP_RI_MPA_RX_MARKER_ENABLE = 0x1, + uP_RI_MPA_TX_MARKER_ENABLE = 0x2, + uP_RI_MPA_CRC_ENABLE = 0x4, + uP_RI_MPA_IETF_ENABLE = 0x8 +} __attribute__ ((packed)); + +enum t3_qp_caps { + uP_RI_QP_RDMA_READ_ENABLE = 0x01, + uP_RI_QP_RDMA_WRITE_ENABLE = 0x02, + uP_RI_QP_BIND_ENABLE = 0x04, + uP_RI_QP_FAST_REGISTER_ENABLE = 0x08, + uP_RI_QP_STAG0_ENABLE = 0x10 +} __attribute__ ((packed)); + +struct t3_rdma_init_attr { + uint32_t tid; + uint32_t qpid; + uint32_t pdid; + uint32_t scqid; + uint32_t rcqid; + uint32_t rq_addr; + uint32_t rq_size; + enum t3_mpa_attrs mpaattrs; + enum t3_qp_caps qpcaps; + uint16_t tcp_emss; + uint32_t ord; + uint32_t ird; + uint64_t qp_dma_addr; + uint32_t qp_dma_size; + uint8_t rqes_posted; +}; + +struct t3_rdma_init_wr { + struct fw_riwrh wrh; /* 0 */ + union t3_wrid wrid; /* 1 */ + uint32_t qpid; /* 2 */ + uint32_t pdid; + uint32_t scqid; /* 3 */ + uint32_t rcqid; + uint32_t rq_addr; /* 4 */ + uint32_t rq_size; + enum t3_mpa_attrs mpaattrs:8; /* 5 */ + enum t3_qp_caps qpcaps:8; + uint32_t ulpdu_size:16; + uint32_t rqes_posted; /* bits 31-1 - reservered */ + /* bit 0 - set if RECV posted */ + uint32_t ord; /* 6 */ + uint32_t ird; + uint64_t qp_dma_addr; /* 7 */ + uint32_t qp_dma_size; /* 8 */ + uint32_t rsvd; +}; + +union t3_wr { + struct t3_send_wr send; + struct t3_rdma_write_wr write; + struct t3_rdma_read_wr read; + struct t3_receive_wr recv; + struct t3_local_inv_wr local_inv; + struct t3_bind_mw_wr bind; + struct t3_bypass_wr bypass; + struct t3_rdma_init_wr init; + struct t3_modify_qp_wr qp_mod; + uint64_t flit[16]; +}; + +#define T3_SQ_CQE_FLIT 13 +#define T3_SQ_COOKIE_FLIT 14 + +#define T3_RQ_COOKIE_FLIT 13 +#define T3_RQ_CQE_FLIT 14 + +static inline void build_fw_riwrh(struct fw_riwrh *wqe, enum t3_wr_opcode op, + enum t3_wr_flags flags, uint8_t genbit, + uint32_t tid, uint8_t len) +{ + wqe->op_seop_flags = htobe32(V_FW_RIWR_OP(op) | + V_FW_RIWR_SOPEOP(M_FW_RIWR_SOPEOP) | + V_FW_RIWR_FLAGS(flags)); + udma_to_device_barrier(); + wqe->gen_tid_len = htobe32(V_FW_RIWR_GEN(genbit) | V_FW_RIWR_TID(tid) | + V_FW_RIWR_LEN(len)); + /* 2nd gen bit... 
*/ + ((union t3_wr *)wqe)->flit[15] = htobe64(genbit); +} + +/* + * T3 ULP2_TX commands + */ +enum t3_utx_mem_op { + T3_UTX_MEM_READ = 2, + T3_UTX_MEM_WRITE = 3 +}; + +/* T3 MC7 RDMA TPT entry format */ + +enum tpt_mem_type { + TPT_NON_SHARED_MR = 0x0, + TPT_SHARED_MR = 0x1, + TPT_MW = 0x2, + TPT_MW_RELAXED_PROTECTION = 0x3 +}; + +enum tpt_addr_type { + TPT_ZBTO = 0, + TPT_VATO = 1 +}; + +enum tpt_mem_perm { + TPT_LOCAL_READ = 0x8, + TPT_LOCAL_WRITE = 0x4, + TPT_REMOTE_READ = 0x2, + TPT_REMOTE_WRITE = 0x1 +}; + +struct tpt_entry { + uint32_t valid_stag_pdid; + uint32_t flags_pagesize_qpid; + + uint32_t rsvd_pbl_addr; + uint32_t len; + uint32_t va_hi; + uint32_t va_low_or_fbo; + + uint32_t rsvd_bind_cnt_or_pstag; + uint32_t rsvd_pbl_size; +}; + +#define S_TPT_VALID 31 +#define V_TPT_VALID(x) ((x) << S_TPT_VALID) +#define F_TPT_VALID V_TPT_VALID(1U) + +#define S_TPT_STAG_KEY 23 +#define M_TPT_STAG_KEY 0xFF +#define V_TPT_STAG_KEY(x) ((x) << S_TPT_STAG_KEY) +#define G_TPT_STAG_KEY(x) (((x) >> S_TPT_STAG_KEY) & M_TPT_STAG_KEY) + +#define S_TPT_STAG_STATE 22 +#define V_TPT_STAG_STATE(x) ((x) << S_TPT_STAG_STATE) +#define F_TPT_STAG_STATE V_TPT_STAG_STATE(1U) + +#define S_TPT_STAG_TYPE 20 +#define M_TPT_STAG_TYPE 0x3 +#define V_TPT_STAG_TYPE(x) ((x) << S_TPT_STAG_TYPE) +#define G_TPT_STAG_TYPE(x) (((x) >> S_TPT_STAG_TYPE) & M_TPT_STAG_TYPE) + +#define S_TPT_PDID 0 +#define M_TPT_PDID 0xFFFFF +#define V_TPT_PDID(x) ((x) << S_TPT_PDID) +#define G_TPT_PDID(x) (((x) >> S_TPT_PDID) & M_TPT_PDID) + +#define S_TPT_PERM 28 +#define M_TPT_PERM 0xF +#define V_TPT_PERM(x) ((x) << S_TPT_PERM) +#define G_TPT_PERM(x) (((x) >> S_TPT_PERM) & M_TPT_PERM) + +#define S_TPT_REM_INV_DIS 27 +#define V_TPT_REM_INV_DIS(x) ((x) << S_TPT_REM_INV_DIS) +#define F_TPT_REM_INV_DIS V_TPT_REM_INV_DIS(1U) + +#define S_TPT_ADDR_TYPE 26 +#define V_TPT_ADDR_TYPE(x) ((x) << S_TPT_ADDR_TYPE) +#define F_TPT_ADDR_TYPE V_TPT_ADDR_TYPE(1U) + +#define S_TPT_MW_BIND_ENABLE 25 +#define V_TPT_MW_BIND_ENABLE(x) ((x) << S_TPT_MW_BIND_ENABLE) +#define F_TPT_MW_BIND_ENABLE V_TPT_MW_BIND_ENABLE(1U) + +#define S_TPT_PAGE_SIZE 20 +#define M_TPT_PAGE_SIZE 0x1F +#define V_TPT_PAGE_SIZE(x) ((x) << S_TPT_PAGE_SIZE) +#define G_TPT_PAGE_SIZE(x) (((x) >> S_TPT_PAGE_SIZE) & M_TPT_PAGE_SIZE) + +#define S_TPT_PBL_ADDR 0 +#define M_TPT_PBL_ADDR 0x1FFFFFFF +#define V_TPT_PBL_ADDR(x) ((x) << S_TPT_PBL_ADDR) +#define G_TPT_PBL_ADDR(x) (((x) >> S_TPT_PBL_ADDR) & M_TPT_PBL_ADDR) + +#define S_TPT_QPID 0 +#define M_TPT_QPID 0xFFFFF +#define V_TPT_QPID(x) ((x) << S_TPT_QPID) +#define G_TPT_QPID(x) (((x) >> S_TPT_QPID) & M_TPT_QPID) + +#define S_TPT_PSTAG 0 +#define M_TPT_PSTAG 0xFFFFFF +#define V_TPT_PSTAG(x) ((x) << S_TPT_PSTAG) +#define G_TPT_PSTAG(x) (((x) >> S_TPT_PSTAG) & M_TPT_PSTAG) + +#define S_TPT_PBL_SIZE 0 +#define M_TPT_PBL_SIZE 0xFFFFF +#define V_TPT_PBL_SIZE(x) ((x) << S_TPT_PBL_SIZE) +#define G_TPT_PBL_SIZE(x) (((x) >> S_TPT_PBL_SIZE) & M_TPT_PBL_SIZE) + +/* + * CQE defs + */ +struct t3_cqe { + uint32_t header:32; + uint32_t len:32; + uint32_t wrid_hi_stag:32; + uint32_t wrid_low_msn:32; +}; + +#define S_CQE_OOO 31 +#define M_CQE_OOO 0x1 +#define G_CQE_OOO(x) ((((x) >> S_CQE_OOO)) & M_CQE_OOO) +#define V_CEQ_OOO(x) ((x)<<S_CQE_OOO) + +#define S_CQE_QPID 12 +#define M_CQE_QPID 0x7FFFF +#define G_CQE_QPID(x) ((((x) >> S_CQE_QPID)) & M_CQE_QPID) +#define V_CQE_QPID(x) ((x)<<S_CQE_QPID) + +#define S_CQE_SWCQE 11 +#define M_CQE_SWCQE 0x1 +#define G_CQE_SWCQE(x) ((((x) >> S_CQE_SWCQE)) & M_CQE_SWCQE) +#define V_CQE_SWCQE(x) ((x)<<S_CQE_SWCQE) + +#define 
S_CQE_GENBIT 10 +#define M_CQE_GENBIT 0x1 +#define G_CQE_GENBIT(x) (((x) >> S_CQE_GENBIT) & M_CQE_GENBIT) +#define V_CQE_GENBIT(x) ((x)<<S_CQE_GENBIT) + +#define S_CQE_STATUS 5 +#define M_CQE_STATUS 0x1F +#define G_CQE_STATUS(x) ((((x) >> S_CQE_STATUS)) & M_CQE_STATUS) +#define V_CQE_STATUS(x) ((x)<<S_CQE_STATUS) + +#define S_CQE_TYPE 4 +#define M_CQE_TYPE 0x1 +#define G_CQE_TYPE(x) ((((x) >> S_CQE_TYPE)) & M_CQE_TYPE) +#define V_CQE_TYPE(x) ((x)<<S_CQE_TYPE) + +#define S_CQE_OPCODE 0 +#define M_CQE_OPCODE 0xF +#define G_CQE_OPCODE(x) ((((x) >> S_CQE_OPCODE)) & M_CQE_OPCODE) +#define V_CQE_OPCODE(x) ((x)<<S_CQE_OPCODE) + +#define SW_CQE(x) (G_CQE_SWCQE(be32toh((x).header))) +#define CQE_OOO(x) (G_CQE_OOO(be32toh((x).header))) +#define CQE_QPID(x) (G_CQE_QPID(be32toh((x).header))) +#define CQE_GENBIT(x) (G_CQE_GENBIT(be32toh((x).header))) +#define CQE_TYPE(x) (G_CQE_TYPE(be32toh((x).header))) +#define SQ_TYPE(x) (CQE_TYPE((x))) +#define RQ_TYPE(x) (!CQE_TYPE((x))) +#define CQE_STATUS(x) (G_CQE_STATUS(be32toh((x).header))) +#define CQE_OPCODE(x) (G_CQE_OPCODE(be32toh((x).header))) + +#define CQE_LEN(x) (be32toh((x).len)) + +#define CQE_WRID_HI(x) (be32toh((x).wrid_hi_stag)) +#define CQE_WRID_LOW(x) (be32toh((x).wrid_low_msn)) + +/* used for RQ completion processing */ +#define CQE_WRID_STAG(x) (be32toh((x).wrid_hi_stag)) +#define CQE_WRID_MSN(x) (be32toh((x).wrid_low_msn)) + +/* used for SQ completion processing */ +#define CQE_WRID_SQ_WPTR(x) ((x).wrid_hi_stag) +#define CQE_WRID_WPTR(x) ((x).wrid_low_msn) + +#define TPT_ERR_SUCCESS 0x0 +#define TPT_ERR_STAG 0x1 /* STAG invalid: either the */ + /* STAG is offlimt, being 0, */ + /* or STAG_key mismatch */ +#define TPT_ERR_PDID 0x2 /* PDID mismatch */ +#define TPT_ERR_QPID 0x3 /* QPID mismatch */ +#define TPT_ERR_ACCESS 0x4 /* Invalid access right */ +#define TPT_ERR_WRAP 0x5 /* Wrap error */ +#define TPT_ERR_BOUND 0x6 /* base and bounds voilation */ +#define TPT_ERR_INVALIDATE_SHARED_MR 0x7 /* attempt to invalidate a */ + /* shared memory region */ +#define TPT_ERR_INVALIDATE_MR_WITH_MW_BOUND 0x8 /* attempt to invalidate a */ + /* shared memory region */ +#define TPT_ERR_ECC 0x9 /* ECC error detected */ +#define TPT_ERR_ECC_PSTAG 0xA /* ECC error detected when */ + /* reading PSTAG for a MW */ + /* Invalidate */ +#define TPT_ERR_PBL_ADDR_BOUND 0xB /* pbl addr out of bounds: */ + /* software error */ +#define TPT_ERR_SWFLUSH 0xC /* SW FLUSHED */ +#define TPT_ERR_CRC 0x10 /* CRC error */ +#define TPT_ERR_MARKER 0x11 /* Marker error */ +#define TPT_ERR_PDU_LEN_ERR 0x12 /* invalid PDU length */ +#define TPT_ERR_OUT_OF_RQE 0x13 /* out of RQE */ +#define TPT_ERR_DDP_VERSION 0x14 /* wrong DDP version */ +#define TPT_ERR_RDMA_VERSION 0x15 /* wrong RDMA version */ +#define TPT_ERR_OPCODE 0x16 /* invalid rdma opcode */ +#define TPT_ERR_DDP_QUEUE_NUM 0x17 /* invalid ddp queue number */ +#define TPT_ERR_MSN 0x18 /* MSN error */ +#define TPT_ERR_TBIT 0x19 /* tag bit not set correctly */ +#define TPT_ERR_MO 0x1A /* MO not 0 for TERMINATE */ + /* or READ_REQ */ +#define TPT_ERR_MSN_GAP 0x1B +#define TPT_ERR_MSN_RANGE 0x1C +#define TPT_ERR_IRD_OVERFLOW 0x1D +#define TPT_ERR_RQE_ADDR_BOUND 0x1E /* RQE addr out of bounds: */ + /* software error */ +#define TPT_ERR_INTERNAL_ERR 0x1F /* internal error (opcode */ + /* mismatch) */ + +struct t3_swsq { + uint64_t wr_id; + struct t3_cqe cqe; + uint32_t sq_wptr; + uint32_t read_len; + int opcode; + int complete; + int signaled; +}; + +/* + * A T3 WQ implements both the SQ and RQ. 
+ */ +struct t3_wq { + union t3_wr *queue; /* DMA Mapped work queue */ + uint32_t error; /* 1 once we go to ERROR */ + uint32_t qpid; + uint32_t wptr; /* idx to next available WR slot */ + uint32_t size_log2; /* total wq size */ + struct t3_swsq *sq; /* SW SQ */ + struct t3_swsq *oldest_read; /* tracks oldest pending read */ + uint32_t sq_wptr; /* sq_wptr - sq_rptr == count of */ + uint32_t sq_rptr; /* pending wrs */ + uint32_t sq_size_log2; /* sq size */ + uint64_t *rq; /* SW RQ (holds consumer wr_ids) */ + uint32_t rq_wptr; /* rq_wptr - rq_rptr == count of */ + uint32_t rq_rptr; /* pending wrs */ + uint32_t rq_size_log2; /* rq size */ + volatile uint32_t *doorbell; /* mapped adapter doorbell register */ + int flushed; +}; + +struct t3_cq { + uint32_t cqid; + uint32_t rptr; + uint32_t wptr; + uint32_t size_log2; + struct t3_cqe *queue; + struct t3_cqe *sw_queue; + uint32_t sw_rptr; + uint32_t sw_wptr; + uint32_t memsize; +}; + +static inline unsigned t3_wq_depth(struct t3_wq *wq) +{ + return (1UL<<wq->size_log2); +} + +static inline unsigned t3_sq_depth(struct t3_wq *wq) +{ + return (1UL<<wq->sq_size_log2); +} + +static inline unsigned t3_rq_depth(struct t3_wq *wq) +{ + return (1UL<<wq->rq_size_log2); +} + +static inline unsigned t3_cq_depth(struct t3_cq *cq) +{ + return (1UL<<cq->size_log2); +} + +extern unsigned long iwch_page_size; +extern unsigned long iwch_page_shift; +extern unsigned long iwch_page_mask; + +#define PAGE_ALIGN(x) (((x) + iwch_page_mask) & ~iwch_page_mask) + +static inline unsigned t3_wq_memsize(struct t3_wq *wq) +{ + return PAGE_ALIGN((1UL<<wq->size_log2) * sizeof (union t3_wr)); +} + +static inline unsigned t3_cq_memsize(struct t3_cq *cq) +{ + return cq->memsize; +} + +static inline unsigned t3_mmid(uint32_t stag) +{ + return (stag>>8); +} + +struct t3_cq_status_page { + uint32_t cq_err; +}; + +static inline int t3_cq_in_error(struct t3_cq *cq) +{ + return ((struct t3_cq_status_page *) + &cq->queue[1 << cq->size_log2])->cq_err; +} + +static inline void t3_set_cq_in_error(struct t3_cq *cq) +{ + ((struct t3_cq_status_page *) + &cq->queue[1 << cq->size_log2])->cq_err = 1; +} + +static inline void t3_reset_cq_in_error(struct t3_cq *cq) +{ + ((struct t3_cq_status_page *) + &cq->queue[1 << cq->size_log2])->cq_err = 0; +} + +static inline int t3_wq_in_error(struct t3_wq *wq) +{ + /* + * The kernel sets bit 0 in the first WR of the WQ memory + * when the QP moves out of RTS... 
+ */ + return (wq->queue->flit[13] & 1); +} + +static inline void t3_set_wq_in_error(struct t3_wq *wq) +{ + wq->queue->flit[13] |= 1; +} + +static inline int t3_wq_db_enabled(struct t3_wq *wq) +{ + return !(wq->queue->flit[13] & 2); +} + +#define CQ_VLD_ENTRY(ptr,size_log2,cqe) (Q_GENBIT(ptr,size_log2) == \ + CQE_GENBIT(*cqe)) + +static inline struct t3_cqe *cxio_next_hw_cqe(struct t3_cq *cq) +{ + struct t3_cqe *cqe; + + cqe = cq->queue + (Q_PTR2IDX(cq->rptr, cq->size_log2)); + if (CQ_VLD_ENTRY(cq->rptr, cq->size_log2, cqe)) + return cqe; + return NULL; +} + +static inline struct t3_cqe *cxio_next_sw_cqe(struct t3_cq *cq) +{ + struct t3_cqe *cqe; + + if (!Q_EMPTY(cq->sw_rptr, cq->sw_wptr)) { + cqe = cq->sw_queue + (Q_PTR2IDX(cq->sw_rptr, cq->size_log2)); + return cqe; + } + return NULL; +} + +static inline struct t3_cqe *cxio_next_cqe(struct t3_cq *cq) +{ + struct t3_cqe *cqe; + + if (!Q_EMPTY(cq->sw_rptr, cq->sw_wptr)) { + cqe = cq->sw_queue + (Q_PTR2IDX(cq->sw_rptr, cq->size_log2)); + return cqe; + } + cqe = cq->queue + (Q_PTR2IDX(cq->rptr, cq->size_log2)); + if (CQ_VLD_ENTRY(cq->rptr, cq->size_log2, cqe)) + return cqe; + return NULL; +} + +/* + * Return a ptr to the next read wr in the SWSQ or NULL. + */ +static inline struct t3_swsq *next_read_wr(struct t3_wq *wq) +{ + uint32_t rptr = wq->oldest_read - wq->sq + 1; + int count = Q_COUNT(rptr, wq->sq_wptr); + struct t3_swsq *sqp; + + while (count--) { + sqp = wq->sq + Q_PTR2IDX(rptr, wq->sq_size_log2); + + if (sqp->opcode == T3_READ_REQ) + return sqp; + + rptr++; + } + return NULL; +} +#endif diff --git providers/cxgb3/firmware_exports.h providers/cxgb3/firmware_exports.h new file mode 100644 index 000000000000..831140a4c8e3 --- /dev/null +++ providers/cxgb3/firmware_exports.h @@ -0,0 +1,148 @@ +/* + * Copyright (c) 2004-2007 Chelsio, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ +#ifndef _FIRMWARE_EXPORTS_H_ +#define _FIRMWARE_EXPORTS_H_ + +/* WR OPCODES supported by the firmware. 
+ */ +#define FW_WROPCODE_FORWARD 0x01 +#define FW_WROPCODE_BYPASS 0x05 + +#define FW_WROPCODE_TUNNEL_TX_PKT 0x03 + +#define FW_WROPOCDE_ULPTX_DATA_SGL 0x00 +#define FW_WROPCODE_ULPTX_MEM_READ 0x02 +#define FW_WROPCODE_ULPTX_PKT 0x04 +#define FW_WROPCODE_ULPTX_INVALIDATE 0x06 + +#define FW_WROPCODE_TUNNEL_RX_PKT 0x07 + +#define FW_WROPCODE_TOE_GETTCB_RPL 0x08 +#define FW_WROPCODE_TOE_CLOSE_CON 0x09 +#define FW_WROPCODE_TOE_TP_ABORT_CON_REQ 0x0A +#define FW_WROPCODE_TOE_HOST_ABORT_CON_RPL 0x0F +#define FW_WROPCODE_TOE_HOST_ABORT_CON_REQ 0x0B +#define FW_WROPCODE_TOE_TP_ABORT_CON_RPL 0x0C +#define FW_WROPCODE_TOE_TX_DATA 0x0D +#define FW_WROPCODE_TOE_TX_DATA_ACK 0x0E + +#define FW_WROPCODE_RI_RDMA_INIT 0x10 +#define FW_WROPCODE_RI_RDMA_WRITE 0x11 +#define FW_WROPCODE_RI_RDMA_READ_REQ 0x12 +#define FW_WROPCODE_RI_RDMA_READ_RESP 0x13 +#define FW_WROPCODE_RI_SEND 0x14 +#define FW_WROPCODE_RI_TERMINATE 0x15 +#define FW_WROPCODE_RI_RDMA_READ 0x16 +#define FW_WROPCODE_RI_RECEIVE 0x17 +#define FW_WROPCODE_RI_BIND_MW 0x18 +#define FW_WROPCODE_RI_FASTREGISTER_MR 0x19 +#define FW_WROPCODE_RI_LOCAL_INV 0x1A +#define FW_WROPCODE_RI_MODIFY_QP 0x1B +#define FW_WROPCODE_RI_BYPASS 0x1C + +#define FW_WROPOCDE_RSVD 0x1E + +#define FW_WROPCODE_SGE_EGRESSCONTEXT_RR 0x1F + +#define FW_WROPCODE_MNGT 0x1D +#define FW_MNGTOPCODE_PKTSCHED_SET 0x00 + +/* Maximum size of a WR sent from the host, limited by the SGE. + * + * Note: WR coming from ULP or TP are only limited by CIM. + */ +#define FW_WR_SIZE 128 + +/* Maximum number of outstanding WRs sent from the host. Value must be + * programmed in the CTRL/TUNNEL/QP SGE Egress Context and used by TOM to + * limit the number of WRs per connection. + */ +#ifndef N3 +# define FW_WR_NUM 16 +#else +# define FW_WR_NUM 7 +#endif + +/* FW_TUNNEL_NUM corresponds to the number of supported TUNNEL Queues. These + * queues must start at SGE Egress Context FW_TUNNEL_SGEEC_START and must + * start at 'TID' (or 'uP Token') FW_TUNNEL_TID_START. + * + * Ingress Traffic (e.g. DMA completion credit) for TUNNEL Queue[i] is sent + * to RESP Queue[i]. + */ +#define FW_TUNNEL_NUM 8 +#define FW_TUNNEL_SGEEC_START 8 +#define FW_TUNNEL_TID_START 65544 + + +/* FW_CTRL_NUM corresponds to the number of supported CTRL Queues. These queues + * must start at SGE Egress Context FW_CTRL_SGEEC_START and must start at 'TID' + * (or 'uP Token') FW_CTRL_TID_START. + * + * Ingress Traffic for CTRL Queue[i] is sent to RESP Queue[i]. + */ +#define FW_CTRL_NUM 8 +#define FW_CTRL_SGEEC_START 65528 +#define FW_CTRL_TID_START 65536 + +/* FW_TOE_NUM corresponds to the number of supported TOE Queues. These queues + * must start at SGE Egress Context FW_TOE_SGEEC_START. + * + * Note: the 'uP Token' in the SGE Egress Context fields is irrelevant for + * TOE Queues, as the host is responsible for providing the correct TID in + * every WR. + * + * Ingress Trafffic for TOE Queue[i] is sent to RESP Queue[i]. + */ +#define FW_TOE_NUM 8 +#define FW_TOE_SGEEC_START 0 + +/* + * + */ +#define FW_RI_NUM 1 +#define FW_RI_SGEEC_START 65527 +#define FW_RI_TID_START 65552 + +/* + * The RX_PKT_TID + */ +#define FW_RX_PKT_NUM 1 +#define FW_RX_PKT_TID_START 65553 + +/* FW_WRC_NUM corresponds to the number of Work Request Context that supported + * by the firmware. 
+ */ +#define FW_WRC_NUM (65536 + FW_TUNNEL_NUM + FW_CTRL_NUM +\ + FW_RI_NUM + FW_RX_PKT_NUM) + +#endif /* _FIRMWARE_EXPORTS_H_ */ diff --git providers/cxgb3/iwch-abi.h providers/cxgb3/iwch-abi.h new file mode 100644 index 000000000000..047f84b7ab63 --- /dev/null +++ providers/cxgb3/iwch-abi.h @@ -0,0 +1,51 @@ +/* + * Copyright (c) 2006-2007 Chelsio, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ +#ifndef IWCH_ABI_H +#define IWCH_ABI_H + +#include <stdint.h> +#include <infiniband/kern-abi.h> +#include <rdma/cxgb3-abi.h> +#include <kernel-abi/cxgb3-abi.h> + +DECLARE_DRV_CMD(uiwch_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, + empty, iwch_alloc_pd_resp); +DECLARE_DRV_CMD(uiwch_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, + iwch_create_cq_req, iwch_create_cq_resp); +DECLARE_DRV_CMD(uiwch_create_qp, IB_USER_VERBS_CMD_CREATE_QP, + empty, iwch_create_qp_resp); +DECLARE_DRV_CMD(uiwch_alloc_ucontext, IB_USER_VERBS_CMD_GET_CONTEXT, + empty, empty); +DECLARE_DRV_CMD(uiwch_reg_mr, IB_USER_VERBS_CMD_REG_MR, + empty, iwch_reg_user_mr_resp); + +#endif /* IWCH_ABI_H */ diff --git providers/cxgb3/iwch.c providers/cxgb3/iwch.c new file mode 100644 index 000000000000..6f3c8b9f1439 --- /dev/null +++ providers/cxgb3/iwch.c @@ -0,0 +1,269 @@ +/* + * Copyright (c) 2006-2007 Chelsio, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ +#include <config.h> + +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <errno.h> +#include <sys/mman.h> +#include <pthread.h> +#include <string.h> + +#include "iwch.h" +#include "iwch-abi.h" + +#define PCI_VENDOR_ID_CHELSIO 0x1425 +#define PCI_DEVICE_ID_CHELSIO_PE9000_2C 0x0020 +#define PCI_DEVICE_ID_CHELSIO_T302E 0x0021 +#define PCI_DEVICE_ID_CHELSIO_T310E 0x0022 +#define PCI_DEVICE_ID_CHELSIO_T320X 0x0023 +#define PCI_DEVICE_ID_CHELSIO_T302X 0x0024 +#define PCI_DEVICE_ID_CHELSIO_T320E 0x0025 +#define PCI_DEVICE_ID_CHELSIO_T310X 0x0026 +#define PCI_DEVICE_ID_CHELSIO_T3B10 0x0030 +#define PCI_DEVICE_ID_CHELSIO_T3B20 0x0031 +#define PCI_DEVICE_ID_CHELSIO_T3B02 0x0032 +#define PCI_DEVICE_ID_CHELSIO_T3C20 0x0035 +#define PCI_DEVICE_ID_CHELSIO_S320E 0x0036 + +#define HCA(v, d, t) \ + VERBS_PCI_MATCH(PCI_VENDOR_ID_##v, PCI_DEVICE_ID_CHELSIO_##d, \ + (void *)(CHELSIO_##t)) +static const struct verbs_match_ent hca_table[] = { + HCA(CHELSIO, PE9000_2C, T3B), + HCA(CHELSIO, T302E, T3A), + HCA(CHELSIO, T302X, T3A), + HCA(CHELSIO, T310E, T3A), + HCA(CHELSIO, T310X, T3A), + HCA(CHELSIO, T320E, T3A), + HCA(CHELSIO, T320X, T3A), + HCA(CHELSIO, T3B10, T3B), + HCA(CHELSIO, T3B20, T3B), + HCA(CHELSIO, T3B02, T3B), + HCA(CHELSIO, T3C20, T3B), + HCA(CHELSIO, S320E, T3B), + {}, +}; + +static const struct verbs_context_ops iwch_ctx_common_ops = { + .query_device = iwch_query_device, + .query_port = iwch_query_port, + .alloc_pd = iwch_alloc_pd, + .dealloc_pd = iwch_free_pd, + .reg_mr = iwch_reg_mr, + .dereg_mr = iwch_dereg_mr, + .create_cq = iwch_create_cq, + .resize_cq = iwch_resize_cq, + .destroy_cq = iwch_destroy_cq, + .create_srq = iwch_create_srq, + .modify_srq = iwch_modify_srq, + .destroy_srq = iwch_destroy_srq, + .create_qp = iwch_create_qp, + .modify_qp = iwch_modify_qp, + .destroy_qp = iwch_destroy_qp, + .query_qp = iwch_query_qp, + .create_ah = iwch_create_ah, + .destroy_ah = iwch_destroy_ah, + .attach_mcast = iwch_attach_mcast, + .detach_mcast = iwch_detach_mcast, + .post_srq_recv = iwch_post_srq_recv, + .req_notify_cq = iwch_arm_cq, +}; + +static const struct verbs_context_ops iwch_ctx_t3a_ops = { + .poll_cq = t3a_poll_cq, + .post_recv = t3a_post_recv, + .post_send = t3a_post_send, +}; + +static const struct verbs_context_ops iwch_ctx_t3b_ops = { + .async_event = t3b_async_event, + .poll_cq = t3b_poll_cq, + .post_recv = t3b_post_recv, + .post_send = t3b_post_send, +}; + +unsigned long iwch_page_size; +unsigned long iwch_page_shift; +unsigned long iwch_page_mask; + +static struct verbs_context *iwch_alloc_context(struct ibv_device *ibdev, + int cmd_fd, + void *private_data) +{ + struct iwch_context *context; + struct ibv_get_context cmd; + struct uiwch_alloc_ucontext_resp resp; + struct iwch_device *rhp = to_iwch_dev(ibdev); + + context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx, + RDMA_DRIVER_CXGB3); + if (!context) + return NULL; + + if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof cmd, + &resp.ibv_resp, sizeof resp)) + goto err_free; + + 
verbs_set_ops(&context->ibv_ctx, &iwch_ctx_common_ops); + + switch (rhp->hca_type) { + case CHELSIO_T3B: + PDBG("%s T3B device\n", __FUNCTION__); + verbs_set_ops(&context->ibv_ctx, &iwch_ctx_t3b_ops); + break; + case CHELSIO_T3A: + PDBG("%s T3A device\n", __FUNCTION__); + verbs_set_ops(&context->ibv_ctx, &iwch_ctx_t3a_ops); + break; + default: + PDBG("%s unknown hca type %d\n", __FUNCTION__, rhp->hca_type); + goto err_free; + break; + } + + return &context->ibv_ctx; + +err_free: + verbs_uninit_context(&context->ibv_ctx); + free(context); + return NULL; +} + +static void iwch_free_context(struct ibv_context *ibctx) +{ + struct iwch_context *context = to_iwch_ctx(ibctx); + + verbs_uninit_context(&context->ibv_ctx); + free(context); +} + +static void iwch_uninit_device(struct verbs_device *verbs_device) +{ + struct iwch_device *dev = to_iwch_dev(&verbs_device->device); + + free(dev); +} + +static bool iwch_device_match(struct verbs_sysfs_dev *sysfs_dev) +{ + char value[32], *cp; + unsigned int fw_maj, fw_min; + + /* Rely on the core code to match PCI devices */ + if (!sysfs_dev->match) + return false; + + /* + * Verify that the firmware major number matches. Major number + * mismatches are fatal. Minor number mismatches are tolerated. + */ + if (ibv_get_fw_ver(value, sizeof(value), sysfs_dev)) + return false; + + cp = strtok(value+1, "."); + sscanf(cp, "%i", &fw_maj); + cp = strtok(NULL, "."); + sscanf(cp, "%i", &fw_min); + + if (fw_maj < FW_MAJ) { + fprintf(stderr, "libcxgb3: Fatal firmware version mismatch. " + "Firmware major number is %u and libcxgb3 needs %u.\n", + fw_maj, FW_MAJ); + fflush(stderr); + return false; + } + + DBGLOG("libcxgb3"); + + if ((signed int)fw_min < FW_MIN) { + PDBG("libcxgb3: non-fatal firmware version mismatch. " + "Firmware minor number is %u and libcxgb3 needs %u.\n", + fw_min, FW_MIN); + fflush(stderr); + } + + return true; +} + +static struct verbs_device *iwch_device_alloc(struct verbs_sysfs_dev *sysfs_dev) +{ + struct iwch_device *dev; + + dev = calloc(1, sizeof(*dev)); + if (!dev) + return NULL; + + pthread_spin_init(&dev->lock, PTHREAD_PROCESS_PRIVATE); + dev->hca_type = (uintptr_t)sysfs_dev->match->driver_data; + dev->abi_version = sysfs_dev->abi_ver; + + iwch_page_size = sysconf(_SC_PAGESIZE); + iwch_page_shift = long_log2(iwch_page_size); + iwch_page_mask = iwch_page_size - 1; + + dev->mmid2ptr = calloc(T3_MAX_NUM_STAG, sizeof(void *)); + if (!dev->mmid2ptr) { + goto err1; + } + dev->qpid2ptr = calloc(T3_MAX_NUM_QP, sizeof(void *)); + if (!dev->qpid2ptr) { + goto err2; + } + dev->cqid2ptr = calloc(T3_MAX_NUM_CQ, sizeof(void *)); + if (!dev->cqid2ptr) + goto err3; + + return &dev->ibv_dev; + +err3: + free(dev->qpid2ptr); +err2: + free(dev->mmid2ptr); +err1: + free(dev); + return NULL; +} + +static const struct verbs_device_ops iwch_dev_ops = { + .name = "cxgb3", + .match_min_abi_version = 0, + .match_max_abi_version = ABI_VERS, + .match_table = hca_table, + .match_device = iwch_device_match, + .alloc_device = iwch_device_alloc, + .uninit_device = iwch_uninit_device, + .alloc_context = iwch_alloc_context, + .free_context = iwch_free_context, +}; +PROVIDER_DRIVER(cxgb3, iwch_dev_ops); diff --git providers/cxgb3/iwch.h providers/cxgb3/iwch.h new file mode 100644 index 000000000000..c7d85d3aab2e --- /dev/null +++ providers/cxgb3/iwch.h @@ -0,0 +1,218 @@ +/* + * Copyright (c) 2006-2007 Chelsio, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ +#ifndef IWCH_H +#define IWCH_H + +#include <pthread.h> +#include <inttypes.h> +#include <stddef.h> + +#include <infiniband/driver.h> +#include <util/udma_barrier.h> + +#include "cxio_wr.h" + +enum iwch_hca_type { + CHELSIO_T3A = 0, + CHELSIO_T3B = 1, +}; + +struct iwch_mr; + +#define ABI_VERS 1 + +struct iwch_device { + struct verbs_device ibv_dev; + enum iwch_hca_type hca_type; + struct iwch_mr **mmid2ptr; + struct iwch_qp **qpid2ptr; + struct iwch_cq **cqid2ptr; + pthread_spinlock_t lock; + int abi_version; +}; + +static inline int t3b_device(struct iwch_device *dev) +{ + return (dev->hca_type == CHELSIO_T3B); +} + +static inline int t3a_device(struct iwch_device *dev) +{ + return (dev->hca_type == CHELSIO_T3A); +} + +struct iwch_context { + struct verbs_context ibv_ctx; +}; + +struct iwch_pd { + struct ibv_pd ibv_pd; +}; + +struct iwch_mr { + struct verbs_mr vmr; + uint64_t va_fbo; + uint32_t page_size; + uint32_t pbl_addr; + uint32_t len; +}; + +struct iwch_cq { + struct ibv_cq ibv_cq; + struct iwch_device *rhp; + struct t3_cq cq; + pthread_spinlock_t lock; +}; + +struct iwch_qp { + struct ibv_qp ibv_qp; + struct iwch_device *rhp; + struct t3_wq wq; + pthread_spinlock_t lock; + int sq_sig_all; +}; + +#define to_iwch_xxx(xxx, type) \ + container_of(ib##xxx, struct iwch_##type, ibv_##xxx) + +static inline struct iwch_device *to_iwch_dev(struct ibv_device *ibdev) +{ + return container_of(ibdev, struct iwch_device, ibv_dev.device); +} + +static inline struct iwch_context *to_iwch_ctx(struct ibv_context *ibctx) +{ + return container_of(ibctx, struct iwch_context, ibv_ctx.context); +} + +static inline struct iwch_pd *to_iwch_pd(struct ibv_pd *ibpd) +{ + return to_iwch_xxx(pd, pd); +} + +static inline struct iwch_cq *to_iwch_cq(struct ibv_cq *ibcq) +{ + return to_iwch_xxx(cq, cq); +} + +static inline struct iwch_qp *to_iwch_qp(struct ibv_qp *ibqp) +{ + return to_iwch_xxx(qp, qp); +} + +static inline struct iwch_mr *to_iwch_mr(struct verbs_mr *vmr) +{ + return container_of(vmr, struct iwch_mr, vmr); +} + +static inline unsigned long long_log2(unsigned long x) +{ + unsigned long r = 0; + for (x >>= 1; x > 0; x >>= 1) + r++; + return r; +} + +extern int iwch_query_device(struct ibv_context *context, + struct ibv_device_attr *attr); +extern int iwch_query_port(struct 
ibv_context *context, uint8_t port, + struct ibv_port_attr *attr); + +extern struct ibv_pd *iwch_alloc_pd(struct ibv_context *context); +extern int iwch_free_pd(struct ibv_pd *pd); + +extern struct ibv_mr *iwch_reg_mr(struct ibv_pd *pd, void *addr, size_t length, + uint64_t hca_va, int access); +extern int iwch_dereg_mr(struct verbs_mr *mr); + +struct ibv_cq *iwch_create_cq(struct ibv_context *context, int cqe, + struct ibv_comp_channel *channel, + int comp_vector); +extern int iwch_resize_cq(struct ibv_cq *cq, int cqe); +extern int iwch_destroy_cq(struct ibv_cq *cq); +extern int t3a_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc); +extern int t3b_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc); +extern int iwch_arm_cq(struct ibv_cq *cq, int solicited); +extern void iwch_cq_event(struct ibv_cq *cq); +extern void iwch_init_cq_buf(struct iwch_cq *cq, int nent); + +extern struct ibv_srq *iwch_create_srq(struct ibv_pd *pd, + struct ibv_srq_init_attr *attr); +extern int iwch_modify_srq(struct ibv_srq *srq, + struct ibv_srq_attr *attr, + int mask); +extern int iwch_destroy_srq(struct ibv_srq *srq); +extern int iwch_post_srq_recv(struct ibv_srq *ibsrq, + struct ibv_recv_wr *wr, + struct ibv_recv_wr **bad_wr); + +extern struct ibv_qp *iwch_create_qp(struct ibv_pd *pd, + struct ibv_qp_init_attr *attr); +extern int iwch_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, + int attr_mask); +extern int iwch_destroy_qp(struct ibv_qp *qp); +extern int iwch_query_qp(struct ibv_qp *qp, + struct ibv_qp_attr *attr, + int attr_mask, + struct ibv_qp_init_attr *init_attr); +extern void iwch_flush_qp(struct iwch_qp *qhp); +extern void iwch_flush_qps(struct iwch_device *dev); +extern int t3a_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, + struct ibv_send_wr **bad_wr); +extern int t3b_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, + struct ibv_send_wr **bad_wr); +extern int t3a_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, + struct ibv_recv_wr **bad_wr); +extern int t3b_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, + struct ibv_recv_wr **bad_wr); +extern struct ibv_ah *iwch_create_ah(struct ibv_pd *pd, + struct ibv_ah_attr *ah_attr); +extern int iwch_destroy_ah(struct ibv_ah *ah); +extern int iwch_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, + uint16_t lid); +extern int iwch_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, + uint16_t lid); +extern void t3b_async_event(struct ibv_context *context, + struct ibv_async_event *event); +#ifdef DEBUG +#include <syslog.h> +#define DBGLOG(s) openlog(s, LOG_NDELAY|LOG_PID, LOG_LOCAL7) +#define PDBG(fmt, args...) do {syslog(LOG_DEBUG, fmt, ##args);} while (0) +#else +#define DBGLOG(s) +#define PDBG(fmt, args...) do {} while (0) +#endif + +#define FW_MAJ 5 +#define FW_MIN 0 + +#endif /* IWCH_H */ diff --git providers/cxgb3/qp.c providers/cxgb3/qp.c new file mode 100644 index 000000000000..4a1e7397cc96 --- /dev/null +++ providers/cxgb3/qp.c @@ -0,0 +1,560 @@ +/* + * Copyright (c) 2006-2007 Chelsio, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ +#include <config.h> + +#include <stdlib.h> +#include <pthread.h> +#include <string.h> + +#include "iwch.h" +#include <stdio.h> + +#define ROUNDUP8(a) (((a) + 7) & ~7) + +static inline int iwch_build_rdma_send(union t3_wr *wqe, struct ibv_send_wr *wr, + uint8_t *flit_cnt) +{ + int i; + + if (wr->num_sge > T3_MAX_SGE) + return -1; + if (wr->send_flags & IBV_SEND_SOLICITED) + wqe->send.rdmaop = T3_SEND_WITH_SE; + else + wqe->send.rdmaop = T3_SEND; + wqe->send.rem_stag = 0; + wqe->send.reserved = 0; + if ((wr->send_flags & IBV_SEND_INLINE) || wr->num_sge == 0) { + uint8_t *datap; + + wqe->send.plen = 0; + datap = (uint8_t *)&wqe->send.sgl[0]; + wqe->send.num_sgle = 0; /* indicates in-line data */ + for (i = 0; i < wr->num_sge; i++) { + if ((wqe->send.plen + wr->sg_list[i].length) > + T3_MAX_INLINE) + return -1; + wqe->send.plen += wr->sg_list[i].length; + memcpy(datap, + (void *)(unsigned long)wr->sg_list[i].addr, + wr->sg_list[i].length); + datap += wr->sg_list[i].length; + } + *flit_cnt = 4 + (ROUNDUP8(wqe->send.plen) >> 3); + wqe->send.plen = htobe32(wqe->send.plen); + } else { + wqe->send.plen = 0; + for (i = 0; i < wr->num_sge; i++) { + if ((wqe->send.plen + wr->sg_list[i].length) < + wqe->send.plen) { + return -1; + } + wqe->send.plen += wr->sg_list[i].length; + wqe->send.sgl[i].stag = + htobe32(wr->sg_list[i].lkey); + wqe->send.sgl[i].len = + htobe32(wr->sg_list[i].length); + wqe->send.sgl[i].to = htobe64(wr->sg_list[i].addr); + } + wqe->send.plen = htobe32(wqe->send.plen); + wqe->send.num_sgle = htobe32(wr->num_sge); + *flit_cnt = 4 + ((wr->num_sge) << 1); + } + return 0; +} + +static inline int iwch_build_rdma_write(union t3_wr *wqe, + struct ibv_send_wr *wr, + uint8_t *flit_cnt) +{ + int i; + + if (wr->num_sge > T3_MAX_SGE) + return -1; + wqe->write.rdmaop = T3_RDMA_WRITE; + wqe->write.reserved = 0; + wqe->write.stag_sink = htobe32(wr->wr.rdma.rkey); + wqe->write.to_sink = htobe64(wr->wr.rdma.remote_addr); + + wqe->write.num_sgle = wr->num_sge; + + if ((wr->send_flags & IBV_SEND_INLINE) || wr->num_sge == 0) { + uint8_t *datap; + + wqe->write.plen = 0; + datap = (uint8_t *)&wqe->write.sgl[0]; + wqe->write.num_sgle = 0; /* indicates in-line data */ + for (i = 0; i < wr->num_sge; i++) { + if ((wqe->write.plen + wr->sg_list[i].length) > + T3_MAX_INLINE) + 
return -1; + wqe->write.plen += wr->sg_list[i].length; + memcpy(datap, + (void *)(unsigned long)wr->sg_list[i].addr, + wr->sg_list[i].length); + datap += wr->sg_list[i].length; + } + *flit_cnt = 5 + (ROUNDUP8(wqe->write.plen) >> 3); + wqe->write.plen = htobe32(wqe->write.plen); + } else { + wqe->write.plen = 0; + for (i = 0; i < wr->num_sge; i++) { + if ((wqe->write.plen + wr->sg_list[i].length) < + wqe->write.plen) { + return -1; + } + wqe->write.plen += wr->sg_list[i].length; + wqe->write.sgl[i].stag = + htobe32(wr->sg_list[i].lkey); + wqe->write.sgl[i].len = + htobe32(wr->sg_list[i].length); + wqe->write.sgl[i].to = + htobe64(wr->sg_list[i].addr); + } + wqe->write.plen = htobe32(wqe->write.plen); + wqe->write.num_sgle = htobe32(wr->num_sge); + *flit_cnt = 5 + ((wr->num_sge) << 1); + } + return 0; +} + +static inline int iwch_build_rdma_read(union t3_wr *wqe, struct ibv_send_wr *wr, + uint8_t *flit_cnt) +{ + if (wr->num_sge > 1) + return -1; + wqe->read.rdmaop = T3_READ_REQ; + wqe->read.reserved = 0; + if (wr->num_sge == 1 && wr->sg_list[0].length > 0) { + wqe->read.rem_stag = htobe32(wr->wr.rdma.rkey); + wqe->read.rem_to = htobe64(wr->wr.rdma.remote_addr); + wqe->read.local_stag = htobe32(wr->sg_list[0].lkey); + wqe->read.local_len = htobe32(wr->sg_list[0].length); + wqe->read.local_to = htobe64(wr->sg_list[0].addr); + } else { + + /* build passable 0B read request */ + wqe->read.rem_stag = 2; + wqe->read.rem_to = 2; + wqe->read.local_stag = 2; + wqe->read.local_len = 0; + wqe->read.local_to = 2; + } + *flit_cnt = sizeof(struct t3_rdma_read_wr) >> 3; + return 0; +} + +int t3b_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, + struct ibv_send_wr **bad_wr) +{ + int err = 0; + uint8_t t3_wr_flit_cnt; + enum t3_wr_opcode t3_wr_opcode = 0; + enum t3_wr_flags t3_wr_flags; + struct iwch_qp *qhp; + uint32_t idx; + union t3_wr *wqe; + uint32_t num_wrs; + struct t3_swsq *sqp; + + qhp = to_iwch_qp(ibqp); + pthread_spin_lock(&qhp->lock); + if (t3_wq_in_error(&qhp->wq)) { + iwch_flush_qp(qhp); + pthread_spin_unlock(&qhp->lock); + return -1; + } + num_wrs = Q_FREECNT(qhp->wq.sq_rptr, qhp->wq.sq_wptr, + qhp->wq.sq_size_log2); + if (num_wrs <= 0) { + pthread_spin_unlock(&qhp->lock); + return -1; + } + while (wr) { + if (num_wrs == 0) { + err = -1; + *bad_wr = wr; + break; + } + idx = Q_PTR2IDX(qhp->wq.wptr, qhp->wq.size_log2); + wqe = (union t3_wr *) (qhp->wq.queue + idx); + t3_wr_flags = 0; + if (wr->send_flags & IBV_SEND_SOLICITED) + t3_wr_flags |= T3_SOLICITED_EVENT_FLAG; + if (wr->send_flags & IBV_SEND_FENCE) + t3_wr_flags |= T3_READ_FENCE_FLAG; + if ((wr->send_flags & IBV_SEND_SIGNALED) || qhp->sq_sig_all) + t3_wr_flags |= T3_COMPLETION_FLAG; + sqp = qhp->wq.sq + + Q_PTR2IDX(qhp->wq.sq_wptr, qhp->wq.sq_size_log2); + switch (wr->opcode) { + case IBV_WR_SEND: + t3_wr_opcode = T3_WR_SEND; + err = iwch_build_rdma_send(wqe, wr, &t3_wr_flit_cnt); + break; + case IBV_WR_RDMA_WRITE: + t3_wr_opcode = T3_WR_WRITE; + err = iwch_build_rdma_write(wqe, wr, &t3_wr_flit_cnt); + break; + case IBV_WR_RDMA_READ: + t3_wr_opcode = T3_WR_READ; + t3_wr_flags = 0; + err = iwch_build_rdma_read(wqe, wr, &t3_wr_flit_cnt); + if (err) + break; + sqp->read_len = wqe->read.local_len; + if (!qhp->wq.oldest_read) + qhp->wq.oldest_read = sqp; + break; + default: + PDBG("%s post of type=%d TBD!\n", __FUNCTION__, + wr->opcode); + err = -1; + } + if (err) { + *bad_wr = wr; + break; + } + wqe->send.wrid.id0.hi = qhp->wq.sq_wptr; + sqp->wr_id = wr->wr_id; + sqp->opcode = wr2opcode(t3_wr_opcode); + sqp->sq_wptr = qhp->wq.sq_wptr; 
+ sqp->complete = 0; + sqp->signaled = (wr->send_flags & IBV_SEND_SIGNALED); + + build_fw_riwrh((void *) wqe, t3_wr_opcode, t3_wr_flags, + Q_GENBIT(qhp->wq.wptr, qhp->wq.size_log2), + 0, t3_wr_flit_cnt); + PDBG("%s cookie 0x%" PRIx64 + " wq idx 0x%x swsq idx %ld opcode %d\n", + __FUNCTION__, wr->wr_id, idx, + Q_PTR2IDX(qhp->wq.sq_wptr, qhp->wq.sq_size_log2), + sqp->opcode); + wr = wr->next; + num_wrs--; + ++(qhp->wq.wptr); + ++(qhp->wq.sq_wptr); + } + pthread_spin_unlock(&qhp->lock); + if (t3_wq_db_enabled(&qhp->wq)) + RING_DOORBELL(qhp->wq.doorbell, qhp->wq.qpid); + return err; +} + +int t3a_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, + struct ibv_send_wr **bad_wr) +{ + int ret; + struct iwch_qp *qhp = to_iwch_qp(ibqp); + + pthread_spin_lock(&qhp->lock); + ret = ibv_cmd_post_send(ibqp, wr, bad_wr); + pthread_spin_unlock(&qhp->lock); + return ret; +} + +static inline int iwch_build_rdma_recv(struct iwch_device *rhp, + union t3_wr *wqe, + struct ibv_recv_wr *wr) +{ + int i; + if (wr->num_sge > T3_MAX_SGE) + return -1; + + wqe->recv.num_sgle = htobe32(wr->num_sge); + for (i = 0; i < wr->num_sge; i++) { + wqe->recv.sgl[i].stag = htobe32(wr->sg_list[i].lkey); + wqe->recv.sgl[i].len = htobe32(wr->sg_list[i].length); + wqe->recv.sgl[i].to = htobe64(wr->sg_list[i].addr); + } + for (; i < T3_MAX_SGE; i++) { + wqe->recv.sgl[i].stag = 0; + wqe->recv.sgl[i].len = 0; + wqe->recv.sgl[i].to = 0; + } + return 0; +} + +static void insert_recv_cqe(struct t3_wq *wq, struct t3_cq *cq) +{ + struct t3_cqe cqe; + + PDBG("%s wq %p cq %p sw_rptr 0x%x sw_wptr 0x%x\n", __FUNCTION__, + wq, cq, cq->sw_rptr, cq->sw_wptr); + memset(&cqe, 0, sizeof(cqe)); + cqe.header = V_CQE_STATUS(TPT_ERR_SWFLUSH) | + V_CQE_OPCODE(T3_SEND) | + V_CQE_TYPE(0) | + V_CQE_SWCQE(1) | + V_CQE_QPID(wq->qpid) | + V_CQE_GENBIT(Q_GENBIT(cq->sw_wptr, cq->size_log2)); + cqe.header = htobe32(cqe.header); + *(cq->sw_queue + Q_PTR2IDX(cq->sw_wptr, cq->size_log2)) = cqe; + cq->sw_wptr++; +} + +static void flush_rq(struct t3_wq *wq, struct t3_cq *cq, int count) +{ + uint32_t ptr; + + /* flush RQ */ + PDBG("%s rq_rptr 0x%x rq_wptr 0x%x skip count %u\n", __FUNCTION__, + wq->rq_rptr, wq->rq_wptr, count); + ptr = wq->rq_rptr + count; + while (ptr++ != wq->rq_wptr) { + insert_recv_cqe(wq, cq); + } +} + +static void insert_sq_cqe(struct t3_wq *wq, struct t3_cq *cq, + struct t3_swsq *sqp) +{ + struct t3_cqe cqe; + + PDBG("%s wq %p cq %p sw_rptr 0x%x sw_wptr 0x%x\n", __FUNCTION__, + wq, cq, cq->sw_rptr, cq->sw_wptr); + memset(&cqe, 0, sizeof(cqe)); + cqe.header = V_CQE_STATUS(TPT_ERR_SWFLUSH) | + V_CQE_OPCODE(sqp->opcode) | + V_CQE_TYPE(1) | + V_CQE_SWCQE(1) | + V_CQE_QPID(wq->qpid) | + V_CQE_GENBIT(Q_GENBIT(cq->sw_wptr, cq->size_log2)); + cqe.header = htobe32(cqe.header); + CQE_WRID_SQ_WPTR(cqe) = sqp->sq_wptr; + + *(cq->sw_queue + Q_PTR2IDX(cq->sw_wptr, cq->size_log2)) = cqe; + cq->sw_wptr++; +} + +static void flush_sq(struct t3_wq *wq, struct t3_cq *cq, int count) +{ + uint32_t ptr; + struct t3_swsq *sqp; + + ptr = wq->sq_rptr + count; + sqp = wq->sq + Q_PTR2IDX(ptr, wq->sq_size_log2); + while (ptr != wq->sq_wptr) { + insert_sq_cqe(wq, cq, sqp); + ptr++; + sqp = wq->sq + Q_PTR2IDX(ptr, wq->sq_size_log2); + } +} + +/* + * Move all CQEs from the HWCQ into the SWCQ. 
+ */ +static void flush_hw_cq(struct t3_cq *cq) +{ + struct t3_cqe *cqe, *swcqe; + + PDBG("%s cq %p cqid 0x%x\n", __FUNCTION__, cq, cq->cqid); + cqe = cxio_next_hw_cqe(cq); + while (cqe) { + PDBG("%s flushing hwcq rptr 0x%x to swcq wptr 0x%x\n", + __FUNCTION__, cq->rptr, cq->sw_wptr); + swcqe = cq->sw_queue + Q_PTR2IDX(cq->sw_wptr, cq->size_log2); + *swcqe = *cqe; + swcqe->header |= htobe32(V_CQE_SWCQE(1)); + cq->sw_wptr++; + cq->rptr++; + cqe = cxio_next_hw_cqe(cq); + } +} + +static void count_scqes(struct t3_cq *cq, struct t3_wq *wq, int *count) +{ + struct t3_cqe *cqe; + uint32_t ptr; + + *count = 0; + ptr = cq->sw_rptr; + while (!Q_EMPTY(ptr, cq->sw_wptr)) { + cqe = cq->sw_queue + (Q_PTR2IDX(ptr, cq->size_log2)); + if ((SQ_TYPE(*cqe) || + (CQE_OPCODE(*cqe) == T3_READ_RESP && CQE_WRID_STAG(*cqe) != 1)) && + (CQE_QPID(*cqe) == wq->qpid)) + (*count)++; + ptr++; + } + PDBG("%s cq %p count %d\n", __FUNCTION__, cq, *count); +} + +static void count_rcqes(struct t3_cq *cq, struct t3_wq *wq, int *count) +{ + struct t3_cqe *cqe; + uint32_t ptr; + + *count = 0; + ptr = cq->sw_rptr; + while (!Q_EMPTY(ptr, cq->sw_wptr)) { + cqe = cq->sw_queue + (Q_PTR2IDX(ptr, cq->size_log2)); + if (RQ_TYPE(*cqe) && (CQE_OPCODE(*cqe) != T3_READ_RESP) && + (CQE_QPID(*cqe) == wq->qpid)) + (*count)++; + ptr++; + } + PDBG("%s cq %p count %d\n", __FUNCTION__, cq, *count); +} + +/* + * Assumes qhp lock is held. + */ +void iwch_flush_qp(struct iwch_qp *qhp) +{ + struct iwch_cq *rchp, *schp; + int count; + + if (qhp->wq.flushed) + return; + + rchp = qhp->rhp->cqid2ptr[to_iwch_cq(qhp->ibv_qp.recv_cq)->cq.cqid]; + schp = qhp->rhp->cqid2ptr[to_iwch_cq(qhp->ibv_qp.send_cq)->cq.cqid]; + + PDBG("%s qhp %p rchp %p schp %p\n", __FUNCTION__, qhp, rchp, schp); + qhp->wq.flushed = 1; + +#ifdef notyet + /* take a ref on the qhp since we must release the lock */ + atomic_inc(&qhp->refcnt); +#endif + pthread_spin_unlock(&qhp->lock); + + /* locking heirarchy: cq lock first, then qp lock. */ + pthread_spin_lock(&rchp->lock); + pthread_spin_lock(&qhp->lock); + flush_hw_cq(&rchp->cq); + count_rcqes(&rchp->cq, &qhp->wq, &count); + flush_rq(&qhp->wq, &rchp->cq, count); + pthread_spin_unlock(&qhp->lock); + pthread_spin_unlock(&rchp->lock); + + /* locking heirarchy: cq lock first, then qp lock. 
*/ + pthread_spin_lock(&schp->lock); + pthread_spin_lock(&qhp->lock); + flush_hw_cq(&schp->cq); + count_scqes(&schp->cq, &qhp->wq, &count); + flush_sq(&qhp->wq, &schp->cq, count); + pthread_spin_unlock(&qhp->lock); + pthread_spin_unlock(&schp->lock); + +#ifdef notyet + /* deref */ + if (atomic_dec_and_test(&qhp->refcnt)) + wake_up(&qhp->wait); +#endif + pthread_spin_lock(&qhp->lock); +} + +void iwch_flush_qps(struct iwch_device *dev) +{ + int i; + + pthread_spin_lock(&dev->lock); + for (i=0; i < T3_MAX_NUM_QP; i++) { + struct iwch_qp *qhp = dev->qpid2ptr[i]; + if (qhp) { + if (!qhp->wq.flushed && t3_wq_in_error(&qhp->wq)) { + pthread_spin_lock(&qhp->lock); + iwch_flush_qp(qhp); + pthread_spin_unlock(&qhp->lock); + } + } + } + pthread_spin_unlock(&dev->lock); + +} + +int t3b_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, + struct ibv_recv_wr **bad_wr) +{ + int err = 0; + struct iwch_qp *qhp; + uint32_t idx; + union t3_wr *wqe; + uint32_t num_wrs; + + qhp = to_iwch_qp(ibqp); + pthread_spin_lock(&qhp->lock); + if (t3_wq_in_error(&qhp->wq)) { + iwch_flush_qp(qhp); + pthread_spin_unlock(&qhp->lock); + return -1; + } + num_wrs = Q_FREECNT(qhp->wq.rq_rptr, qhp->wq.rq_wptr, + qhp->wq.rq_size_log2) - 1; + if (!wr) { + pthread_spin_unlock(&qhp->lock); + return -1; + } + while (wr) { + idx = Q_PTR2IDX(qhp->wq.wptr, qhp->wq.size_log2); + wqe = (union t3_wr *) (qhp->wq.queue + idx); + if (num_wrs) + err = iwch_build_rdma_recv(qhp->rhp, wqe, wr); + else + err = -1; + if (err) { + *bad_wr = wr; + break; + } + qhp->wq.rq[Q_PTR2IDX(qhp->wq.rq_wptr, qhp->wq.rq_size_log2)] = + wr->wr_id; + build_fw_riwrh((void *) wqe, T3_WR_RCV, T3_COMPLETION_FLAG, + Q_GENBIT(qhp->wq.wptr, qhp->wq.size_log2), + 0, sizeof(struct t3_receive_wr) >> 3); + PDBG("%s cookie 0x%" PRIx64 + " idx 0x%x rq_wptr 0x%x rw_rptr 0x%x " + "wqe %p \n", __FUNCTION__, wr->wr_id, idx, + qhp->wq.rq_wptr, qhp->wq.rq_rptr, wqe); + ++(qhp->wq.rq_wptr); + ++(qhp->wq.wptr); + wr = wr->next; + num_wrs--; + } + pthread_spin_unlock(&qhp->lock); + if (t3_wq_db_enabled(&qhp->wq)) + RING_DOORBELL(qhp->wq.doorbell, qhp->wq.qpid); + return err; +} + +int t3a_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, + struct ibv_recv_wr **bad_wr) +{ + int ret; + struct iwch_qp *qhp = to_iwch_qp(ibqp); + + pthread_spin_lock(&qhp->lock); + ret = ibv_cmd_post_recv(ibqp, wr, bad_wr); + pthread_spin_unlock(&qhp->lock); + return ret; +} diff --git providers/cxgb3/verbs.c providers/cxgb3/verbs.c new file mode 100644 index 000000000000..39a44192e29c --- /dev/null +++ providers/cxgb3/verbs.c @@ -0,0 +1,476 @@ +/* + * Copyright (c) 2006-2007 Chelsio, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ +#include <config.h> + +#include <stdlib.h> +#include <stdio.h> +#include <string.h> +#include <errno.h> +#include <pthread.h> +#include <sys/mman.h> +#include <inttypes.h> + +#include "iwch.h" +#include "iwch-abi.h" + +int iwch_query_device(struct ibv_context *context, struct ibv_device_attr *attr) +{ + struct ibv_query_device cmd; + uint64_t raw_fw_ver; + unsigned major, minor, sub_minor; + int ret; + + ret = ibv_cmd_query_device(context, attr, &raw_fw_ver, &cmd, + sizeof cmd); + if (ret) + return ret; + + major = (raw_fw_ver >> 32) & 0xffff; + minor = (raw_fw_ver >> 16) & 0xffff; + sub_minor = raw_fw_ver & 0xffff; + + snprintf(attr->fw_ver, sizeof attr->fw_ver, + "%d.%d.%d", major, minor, sub_minor); + + return 0; +} + +int iwch_query_port(struct ibv_context *context, uint8_t port, + struct ibv_port_attr *attr) +{ + struct ibv_query_port cmd; + + return ibv_cmd_query_port(context, port, attr, &cmd, sizeof cmd); +} + +struct ibv_pd *iwch_alloc_pd(struct ibv_context *context) +{ + struct ibv_alloc_pd cmd; + struct uiwch_alloc_pd_resp resp; + struct iwch_pd *pd; + + pd = malloc(sizeof *pd); + if (!pd) + return NULL; + + if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof cmd, + &resp.ibv_resp, sizeof resp)) { + free(pd); + return NULL; + } + + return &pd->ibv_pd; +} + +int iwch_free_pd(struct ibv_pd *pd) +{ + int ret; + + ret = ibv_cmd_dealloc_pd(pd); + if (ret) + return ret; + + free(pd); + return 0; +} + +struct ibv_mr *iwch_reg_mr(struct ibv_pd *pd, void *addr, size_t length, + uint64_t hca_va, int access) +{ + struct iwch_mr *mhp; + struct ibv_reg_mr cmd; + struct uiwch_reg_mr_resp resp; + struct iwch_device *dev = to_iwch_dev(pd->context->device); + + PDBG("%s addr %p length %ld hca_va %p\n", __func__, addr, length, + hca_va); + + mhp = malloc(sizeof *mhp); + if (!mhp) + return NULL; + + if (ibv_cmd_reg_mr(pd, addr, length, hca_va, + access, &mhp->vmr, &cmd, sizeof(cmd), + &resp.ibv_resp, sizeof resp)) { + free(mhp); + return NULL; + } + + mhp->va_fbo = hca_va; + mhp->page_size = iwch_page_shift - 12; + mhp->pbl_addr = resp.pbl_addr; + mhp->len = length; + + PDBG("%s stag 0x%x va_fbo 0x%" PRIx64 + " page_size %d pbl_addr 0x%x len %d\n", + __func__, mhp->vmr.ibv_mr.rkey, mhp->va_fbo, + mhp->page_size, mhp->pbl_addr, mhp->len); + + pthread_spin_lock(&dev->lock); + dev->mmid2ptr[t3_mmid(mhp->vmr.ibv_mr.lkey)] = mhp; + pthread_spin_unlock(&dev->lock); + + return &mhp->vmr.ibv_mr; +} + +int iwch_dereg_mr(struct verbs_mr *vmr) +{ + int ret; + struct iwch_device *dev = to_iwch_dev(vmr->ibv_mr.pd->context->device); + + ret = ibv_cmd_dereg_mr(vmr); + if (ret) + return ret; + + pthread_spin_lock(&dev->lock); + dev->mmid2ptr[t3_mmid(vmr->ibv_mr.lkey)] = NULL; + pthread_spin_unlock(&dev->lock); + + free(to_iwch_mr(vmr)); + + return 0; +} + +struct ibv_cq *iwch_create_cq(struct ibv_context *context, int cqe, + struct ibv_comp_channel *channel, int comp_vector) +{ + struct uiwch_create_cq cmd; + struct uiwch_create_cq_resp resp; + struct iwch_cq *chp; + struct iwch_device *dev = 
to_iwch_dev(context->device); + int ret; + + chp = calloc(1, sizeof *chp); + if (!chp) { + return NULL; + } + + cmd.user_rptr_addr = (uint64_t)(unsigned long)&chp->cq.rptr; + ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector, + &chp->ibv_cq, &cmd.ibv_cmd, sizeof cmd, + &resp.ibv_resp, sizeof resp); + if (ret) + goto err1; + + pthread_spin_init(&chp->lock, PTHREAD_PROCESS_PRIVATE); + chp->rhp = dev; + chp->cq.cqid = resp.cqid; + chp->cq.size_log2 = resp.size_log2; + if (dev->abi_version == 0) + chp->cq.memsize = PAGE_ALIGN((1UL << chp->cq.size_log2) * + sizeof(struct t3_cqe)); + else + chp->cq.memsize = resp.memsize; + chp->cq.queue = mmap(NULL, t3_cq_memsize(&chp->cq), + PROT_READ|PROT_WRITE, MAP_SHARED, context->cmd_fd, + resp.key); + if (chp->cq.queue == MAP_FAILED) + goto err2; + + chp->cq.sw_queue = calloc(t3_cq_depth(&chp->cq), sizeof(struct t3_cqe)); + if (!chp->cq.sw_queue) + goto err3; + + PDBG("%s cqid 0x%x physaddr %" PRIx64 " va %p memsize %d\n", + __FUNCTION__, chp->cq.cqid, resp.physaddr, chp->cq.queue, + t3_cq_memsize(&chp->cq)); + + pthread_spin_lock(&dev->lock); + dev->cqid2ptr[chp->cq.cqid] = chp; + pthread_spin_unlock(&dev->lock); + + return &chp->ibv_cq; +err3: + munmap(chp->cq.queue, t3_cq_memsize(&chp->cq)); +err2: + (void)ibv_cmd_destroy_cq(&chp->ibv_cq); +err1: + free(chp); + return NULL; +} + +int iwch_resize_cq(struct ibv_cq *ibcq, int cqe) +{ +#ifdef notyet + int ret; + struct ibv_resize_cq cmd; + struct iwch_cq *chp = to_iwch_cq(ibcq); + + pthread_spin_lock(&chp->lock); + ret = ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd); + /* remap and realloc swcq here */ + pthread_spin_unlock(&chp->lock); + return ret; +#else + return -ENOSYS; +#endif +} + +int iwch_destroy_cq(struct ibv_cq *ibcq) +{ + int ret; + struct iwch_cq *chp = to_iwch_cq(ibcq); + void *cqva = chp->cq.queue; + unsigned size = t3_cq_memsize(&chp->cq); + struct iwch_device *dev = to_iwch_dev(ibcq->context->device); + + munmap(cqva, size); + ret = ibv_cmd_destroy_cq(ibcq); + if (ret) { + return ret; + } + + pthread_spin_lock(&dev->lock); + dev->cqid2ptr[chp->cq.cqid] = NULL; + pthread_spin_unlock(&dev->lock); + + free(chp->cq.sw_queue); + free(chp); + return 0; +} + +struct ibv_srq *iwch_create_srq(struct ibv_pd *pd, + struct ibv_srq_init_attr *attr) +{ + return NULL; +} + +int iwch_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, + int attr_mask) +{ + return -ENOSYS; +} + +int iwch_destroy_srq(struct ibv_srq *srq) +{ + return -ENOSYS; +} + +int iwch_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, + struct ibv_recv_wr **bad_wr) +{ + return -ENOSYS; +} + +struct ibv_qp *iwch_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) +{ + struct uiwch_create_qp cmd; + struct uiwch_create_qp_resp resp; + struct iwch_qp *qhp; + struct iwch_device *dev = to_iwch_dev(pd->context->device); + int ret; + void *dbva; + + PDBG("%s enter qp\n", __FUNCTION__); + qhp = calloc(1, sizeof *qhp); + if (!qhp) + goto err1; + + ret = ibv_cmd_create_qp(pd, &qhp->ibv_qp, attr, &cmd.ibv_cmd, + sizeof cmd, &resp.ibv_resp, sizeof resp); + if (ret) + goto err2; + + PDBG("%s qpid 0x%x physaddr %" PRIx64 " doorbell %" PRIx64 + " size %d sq_size %d rq_size %d\n", + __FUNCTION__, resp.qpid, resp.physaddr, resp.doorbell, + 1 << resp.size_log2, 1 << resp.sq_size_log2, + 1 << resp.rq_size_log2); + + qhp->rhp = dev; + qhp->wq.qpid = resp.qpid; + qhp->wq.size_log2 = resp.size_log2; + qhp->wq.sq_size_log2 = resp.sq_size_log2; + qhp->wq.rq_size_log2 = resp.rq_size_log2; + pthread_spin_init(&qhp->lock, 
PTHREAD_PROCESS_PRIVATE); + dbva = mmap(NULL, iwch_page_size, PROT_WRITE, MAP_SHARED, + pd->context->cmd_fd, resp.db_key & ~(iwch_page_mask)); + if (dbva == MAP_FAILED) + goto err3; + + qhp->wq.doorbell = dbva + (resp.db_key & (iwch_page_mask)); + qhp->wq.queue = mmap(NULL, t3_wq_memsize(&qhp->wq), + PROT_READ|PROT_WRITE, MAP_SHARED, + pd->context->cmd_fd, resp.key); + if (qhp->wq.queue == MAP_FAILED) + goto err4; + + qhp->wq.rq = calloc(t3_rq_depth(&qhp->wq), sizeof (uint64_t)); + if (!qhp->wq.rq) + goto err5; + + qhp->wq.sq = calloc(t3_sq_depth(&qhp->wq), sizeof (struct t3_swsq)); + if (!qhp->wq.sq) + goto err6; + + PDBG("%s dbva %p wqva %p wq memsize %d\n", __FUNCTION__, + qhp->wq.doorbell, qhp->wq.queue, t3_wq_memsize(&qhp->wq)); + + qhp->sq_sig_all = attr->sq_sig_all; + + pthread_spin_lock(&dev->lock); + dev->qpid2ptr[qhp->wq.qpid] = qhp; + pthread_spin_unlock(&dev->lock); + + return &qhp->ibv_qp; +err6: + free(qhp->wq.rq); +err5: + munmap((void *)qhp->wq.queue, t3_wq_memsize(&qhp->wq)); +err4: + munmap((void *)dbva, iwch_page_size); +err3: + (void)ibv_cmd_destroy_qp(&qhp->ibv_qp); +err2: + free(qhp); +err1: + return NULL; +} + +static void reset_qp(struct iwch_qp *qhp) +{ + PDBG("%s enter qp %p\n", __FUNCTION__, qhp); + qhp->wq.wptr = 0; + qhp->wq.rq_wptr = qhp->wq.rq_rptr = 0; + qhp->wq.sq_wptr = qhp->wq.sq_rptr = 0; + qhp->wq.error = 0; + qhp->wq.oldest_read = NULL; + memset(qhp->wq.queue, 0, t3_wq_memsize(&qhp->wq)); +} + +int iwch_modify_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, + int attr_mask) +{ + struct ibv_modify_qp cmd = {}; + struct iwch_qp *qhp = to_iwch_qp(ibqp); + int ret; + + PDBG("%s enter qp %p new state %d\n", __FUNCTION__, ibqp, attr_mask & IBV_QP_STATE ? attr->qp_state : -1); + pthread_spin_lock(&qhp->lock); + if (t3b_device(qhp->rhp) && t3_wq_in_error(&qhp->wq)) + iwch_flush_qp(qhp); + ret = ibv_cmd_modify_qp(ibqp, attr, attr_mask, &cmd, sizeof cmd); + if (!ret && (attr_mask & IBV_QP_STATE) && attr->qp_state == IBV_QPS_RESET) + reset_qp(qhp); + pthread_spin_unlock(&qhp->lock); + return ret; +} + +int iwch_destroy_qp(struct ibv_qp *ibqp) +{ + int ret; + struct iwch_qp *qhp = to_iwch_qp(ibqp); + struct iwch_device *dev = to_iwch_dev(ibqp->context->device); + void *dbva, *wqva; + unsigned wqsize; + + PDBG("%s enter qp %p\n", __FUNCTION__, ibqp); + if (t3b_device(dev)) { + pthread_spin_lock(&qhp->lock); + iwch_flush_qp(qhp); + pthread_spin_unlock(&qhp->lock); + } + + dbva = (void *)((unsigned long)qhp->wq.doorbell & ~(iwch_page_mask)); + wqva = qhp->wq.queue; + wqsize = t3_wq_memsize(&qhp->wq); + + munmap(dbva, iwch_page_size); + munmap(wqva, wqsize); + ret = ibv_cmd_destroy_qp(ibqp); + if (ret) { + return ret; + } + + pthread_spin_lock(&dev->lock); + dev->qpid2ptr[qhp->wq.qpid] = NULL; + pthread_spin_unlock(&dev->lock); + + free(qhp->wq.rq); + free(qhp->wq.sq); + free(qhp); + return 0; +} + +int iwch_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, + int attr_mask, struct ibv_qp_init_attr *init_attr) +{ + return -ENOSYS; +} + +struct ibv_ah *iwch_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) +{ + return NULL; +} + +int iwch_destroy_ah(struct ibv_ah *ah) +{ + return -ENOSYS; +} + +int iwch_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) +{ + return -ENOSYS; +} + +int iwch_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) +{ + return -ENOSYS; +} + +void t3b_async_event(struct ibv_context *context, + struct ibv_async_event *event) +{ + PDBG("%s type %d obj %p\n", __FUNCTION__, event->event_type, + 
event->element.cq); + + switch (event->event_type) { + case IBV_EVENT_CQ_ERR: + break; + case IBV_EVENT_QP_FATAL: + case IBV_EVENT_QP_REQ_ERR: + case IBV_EVENT_QP_ACCESS_ERR: + case IBV_EVENT_PATH_MIG_ERR: { + struct iwch_qp *qhp = to_iwch_qp(event->element.qp); + pthread_spin_lock(&qhp->lock); + iwch_flush_qp(qhp); + pthread_spin_unlock(&qhp->lock); + break; + } + case IBV_EVENT_SQ_DRAINED: + case IBV_EVENT_PATH_MIG: + case IBV_EVENT_COMM_EST: + case IBV_EVENT_QP_LAST_WQE_REACHED: + default: + break; + } +} diff --git redhat/rdma-core.spec redhat/rdma-core.spec index e0b143364991..8909ae2a86b4 100644 --- redhat/rdma-core.spec +++ redhat/rdma-core.spec @@ -128,6 +128,8 @@ Summary: A library and drivers for direct userspace use of RDMA (InfiniBand/iWAR Requires(post): /sbin/ldconfig Requires(postun): /sbin/ldconfig Requires: %{name}%{?_isa} = %{version}-%{release} +Provides: libcxgb3 = %{version}-%{release} +Obsoletes: libcxgb3 < %{version}-%{release} Provides: libcxgb4 = %{version}-%{release} Obsoletes: libcxgb4 < %{version}-%{release} Provides: libefa = %{version}-%{release} @@ -160,6 +162,7 @@ fast path operations. Device-specific plug-in ibverbs userspace drivers are included: +- libcxgb3: Chelsio T3 iWARP HCA - libcxgb4: Chelsio T4 iWARP HCA - libefa: Amazon Elastic Fabric Adapter - libhfi1: Intel Omni-Path HFI diff --git redhat/rdma.kernel-init redhat/rdma.kernel-init index c7444a1c8d77..6f50e72fcc3b 100644 --- redhat/rdma.kernel-init +++ redhat/rdma.kernel-init @@ -125,6 +125,10 @@ load_hardware_modules() load_modules mlx5_ib RC+=$? fi + if is_loaded cxgb3 -a ! is_loaded iw_cxgb3; then + load_modules iw_cxgb3 + RC+=$? + fi if is_loaded cxgb4 -a ! is_loaded iw_cxgb4; then load_modules iw_cxgb4 RC+=$? diff --git suse/rdma-core.spec suse/rdma-core.spec index a32d8f9cb966..4113f2f6a390 100644 --- suse/rdma-core.spec +++ suse/rdma-core.spec @@ -182,6 +182,7 @@ RDMA core development libraries and headers. Summary: Library & drivers for direct userspace use of InfiniBand/iWARP/RoCE hardware Group: System/Libraries Requires: %{name}%{?_isa} = %{version}-%{release} +Obsoletes: libcxgb3-rdmav2 < %{version}-%{release} Obsoletes: libcxgb4-rdmav2 < %{version}-%{release} Obsoletes: libefa-rdmav2 < %{version}-%{release} Obsoletes: libhfi1verbs-rdmav2 < %{version}-%{release} @@ -211,6 +212,7 @@ fast path operations. Device-specific plug-in ibverbs userspace drivers are included: +- libcxgb3: Chelsio T3 iWARP HCA - libcxgb4: Chelsio T4 iWARP HCA - libefa: Amazon Elastic Fabric Adapter - libhfi1: Intel Omni-Path HFI